文章作者:Tyan
博客:noahsnail.com
Chapter 1 Introduction
This book is designed to help you make the most effective use of the Java™ programming language and its fundamental libraries, java.lang, java.util, and, to a lesser extent, java.util.concurrent and java.io. The book discusses other libraries from time to time, but it does not cover graphical user interface programming, enterprise APIs, or mobile devices.
本书的目的是为了帮助你最有效地利用Java编程语言和它的基础库java.lang、java.util,在更小程度上包括java.util.concurrent和java.io。本书有时会讨论其它的库,但不包括图形用户界面编程、企业API或移动设备。
This book consists of seventy-eight items, each of which conveys one rule. The rules capture practices generally held to be beneficial by the best and most experienced programmers. The items are loosely grouped into ten chapters, each concerning one broad aspect of software design. The book is not intended to be read from cover to cover: each item stands on its own, more or less. The items are heavily cross-referenced so you can easily plot your own course through the book.
本书包括七十八个条目,每个条目传达一条规则。这些规则反映了最优秀、最有经验的程序员普遍认为有益的实践。这些条目被松散地分为十章,每章涉及软件设计的一个大的方面。本书并不打算让人从头到尾地阅读:每个条目或多或少都是独立的。这些条目之间有大量的交叉引用,因此你可以很容易地规划自己的阅读路线。
Many new features were added to the platform in Java 5 (release 1.5). Most of the items in this book use these features in some way. The following table shows you where to go for primary coverage of these features:
Java 5平台增加了许多新功能。本书中的大多数条目在某种程度上使用了这些功能。下表列出了这些新功能在本书中的位置:
Most items are illustrated with program examples. A key feature of this book is that it contains code examples illustrating many design patterns and idioms. Where appropriate, they are cross-referenced to the standard reference work in this area [Gamma95].
大多数条目通过程序实例进行说明。本书的一个重要特点是它包含了说明许多设计模式和习惯用法的代码实例。在合适的地方,它们被交叉引用到这个领域的标准参考著作[Gamma95]。
Many items contain one or more program examples illustrating some practice to be avoided. Such examples, sometimes known as antipatterns, are clearly labeled with a comment such as “// Never do this!” In each case, the item explains why the example is bad and suggests an alternative approach.
许多条目包含一个或多个程序实例,用来说明一些应该避免的做法。这些例子有时被称为反模式,它们都清楚地标上了类似“// Never do this!”的注释。在每一个例子中,条目都解释了为什么这个例子是不好的,并且建议了一种替代方法。
This book is not for beginners: it assumes that you are already comfortable with the Java programming language. If you are not, consider one of the many fine introductory texts [Arnold05, Sestoft05]. While the book is designed to be accessible to anyone with a working knowledge of the language, it should provide food for thought even for advanced programmers.
本书不是给初学者的:它假定你已经非常熟悉Java编程语言。如果你对Java语言不熟悉,请考虑许多很好的入门书籍中的一本[Arnold05, Sestoft05]。虽然本书的目标是任何具有实际Java编程经验的人,但它应该能提供一些思考的东西,即使是对于高级程序员。
Most of the rules in this book derive from a few fundamental principles. Clarity and simplicity are of paramount importance. The user of a module should never be surprised by its behavior. Modules should be as small as possible but no smaller. (As used in this book, the term module refers to any reusable software component, from an individual method to a complex system consisting of multiple packages.) Code should be reused rather than copied. The dependencies between modules should be kept to a minimum. Errors should be detected as soon as possible after they are made, ideally at compile time.
本书中的大多数规则源于一些基本的原则。简洁清晰是最重要的。模块的用户不应该对它的行为感到惊奇。模块要尽可能的小但不是更小。(本书中使用的术语模块指的是任何可复用的软件组件,从单个方法到由多个包组成的复杂系统)。代码应该被复用而不是拷贝。模块间的依赖性要保持最小。错误应该尽早检测出来,理想情况是在编译时发现。
While the rules in this book do not apply 100 percent of the time, they do characterize best programming practices in the great majority of cases. You should not slavishly follow these rules, but violate them only occasionally and with good reason. Learning the art of programming, like most other disciplines, consists of first learning the rules and then learning when to break them.
虽然本书中的规则不能百分百的应用于任何时间,但在大多数情况下具有最好编程实践的特征。你不应该盲从这些规则,但只是偶尔在有充足的理由的时候才违反这些规则。像大多数其它学科一样,学习编程艺术包括首先学习规则,然后学习在什么时候打破规则。
For the most part, this book is not about performance. It is about writing programs that are clear, correct, usable, robust, flexible, and maintainable. If you can do that, it’s usually a relatively simple matter to get the performance you need (Item 55). Some items do discuss performance concerns, and a few of these items provide performance numbers. These numbers, which are introduced with the phrase “On my machine,” should be regarded as approximate at best.
本书的大部分内容不是关于性能的,而是关于编写清晰、正确、可用、健壮、灵活并且可维护的程序。如果你能做到这一点,要得到你需要的性能通常是相对简单的(条目55)。一些条目讨论了性能问题,其中少数条目提供了性能数据。这些数据在介绍时使用了“在我的机器上”这样的说法,充其量只应被看作近似值。
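For what it’s worth, my machine is an aging homebuilt 2.2 GHz dual-core AMD Opteron 170 with 2 gigabytes of RAM, running Sun’s 1.6_05 release of the JDK on Microsoft Windows XP SP2. The JDK has two virtual machines, the Java HotSpot Client and Server VMs. Performance numbers were measured on the Server VM.
值得一提的是,我的机器是老旧的组装电脑,2.2G赫兹双核AMD皓龙处理器170,2G内存,在微软的Windows XP SP2上运行Sun的JDK 1.6_05版本。JDK有两个虚拟机,Java HotSpot客户端虚拟机和服务器虚拟机。性能数据是在服务器虚拟机上测量的。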
When discussing features of the Java programming language and its libraries, it is sometimes necessary to refer to specific releases. For brevity, this book uses “engineering version numbers” in preference to official release names. This table shows the mapping between release names and engineering version numbers.
当讨论Java编程语言的特性和它的库时,有时指明特定的版本是必要的。为了简洁,本书使用工程版本号而不是正式的发行名称。下表显示了发行名称与工程版本号的映射关系。
The examples are reasonably complete, but they favor readability over completeness. They freely use classes from the packages java.util and java.io. In order to compile the examples, you may have to add one or more of these import statements:
虽然这些例子是相当完整的,但它们注重可读性甚于完整性。它们可以很自由地使用包java.util和java.io中的类。为了编译这些例子,你可能必须添加一个或多个导入声明:
    import java.util.*;
    import java.io.*;
Other boilerplate is similarly omitted. The book’s Web site, http://java.sun.com/docs/books/effective, contains an expanded version of each example, which you can compile and run.
其它的样板代码也同样被省略了。本书的网站:http://java.sun.com/docs/books/effective,含有每个例子的扩展版本,你可以编译并且运行。
For the most part, this book uses technical terms as they are defined in The Java Language Specification, Third Edition [JLS]. A few terms deserve special mention. The language supports four kinds of types: interfaces (including annotations), classes (including enums), arrays, and primitives. The first three are known as reference types. Class instances and arrays are objects; primitive values are not. A class’s members consist of its fields, methods, member classes, and member interfaces. A method’s signature consists of its name and the types of its formal parameters; the signature does not include the method’s return type.
本书中的大部分技术术语与Java语言规范(第三版)中的术语是一样的。一些术语需要特别指出。Java语言支持四种类型:接口(包括注解),类(包括枚举),数组和基本类型。前三个是引用类型。类实例和数组是对象,基本类型不是。类成员由它的域、方法、成员类和成员接口组成。方法的签名由它的名字、正式的参数类型组成;签名不包括方法的返回值类型。
This book uses a few terms differently from The Java Language Specification. Unlike The Java Language Specification, this book uses inheritance as a synonym for subclassing. Instead of using the term inheritance for interfaces, this book simply states that a class implements an interface or that one interface extends another. To describe the access level that applies when none is specified, this book uses the descriptive term package-private instead of the technically correct term default access [JLS, 6.6.1].
本书使用了一些与Java语言规范不同的术语。不像Java语言规范,本书使用继承作为子类化的同义词。本书不对接口使用继承这个术语,而是简单地表述为一个类实现了一个接口或一个接口扩展了另一个接口。为了描述没有指定访问级别的情况,本书使用描述性术语包私有代替技术上正确的术语缺省访问[JLS, 6.6.1]。
This book uses a few technical terms that are not defined in The Java Language Specification. The term exported API, or simply API, refers to the classes, interfaces, constructors, members, and serialized forms by which a programmer accesses a class, interface, or package. (The term API, which is short for application programming interface, is used in preference to the otherwise preferable term interface to avoid confusion with the language construct of that name.) A programmer who writes a program that uses an API is referred to as a user of the API. A class whose implementation uses an API is a client of the API.
本书使用一些Java语言规范没有定义的术语。术语导出API(exported API),或简称API,指的是类、接口、构造函数、成员和序列化形式,程序员通过它们访问类、接口或包。(术语API是应用程序编程接口的缩写,优先使用它而不是在其它情况下更可取的术语接口,是为了避免与Java语言中同名的语言结构相混淆。)编写使用API的程序的程序员被称为API的用户。实现中使用了API的类被称为API的客户端。
Classes, interfaces, constructors, members, and serialized forms are collectively known as API elements. An exported API consists of the API elements that are accessible outside of the package that defines the API. These are the API elements that any client can use and the author of the API commits to support. Not coincidentally, they are also the elements for which the Javadoc utility generates documentation in its default mode of operation. Loosely speaking, the exported API of a package consists of the public and protected members and constructors of every public class or interface in the package.
类、接口、构造函数、成员和序列化形式统称为API元素。导出API由在定义API的包之外能访问的API元素组成。这些API元素是任何客户端都能使用并且API的作者承诺提供支持的。并非巧合的是,Javadoc工具在默认操作模式下也正是为这些元素生成文档。不严格地说,包的导出API由包中每个公有类或接口的公有和受保护的成员及构造函数组成。
Chapter 2 Creating and Destroying Objects
This chapter concerns creating and destroying objects: when and how to create them, when and how to avoid creating them, how to ensure they are destroyed in a timely manner, and how to manage any cleanup actions that must precede their destruction.
这章是关于创建和销毁对象的:什么时候怎样创建它们,什么时候怎样避免创建它们,怎样确保它们被及时的销毁,怎么管理任何清理操作,清理操作必须在对象销毁之前。
Item 1: Consider static factory methods instead of constructors
Item 1: 考虑用静态工厂方法代替构造函数
The normal way for a class to allow a client to obtain an instance of itself is to provide a public constructor. There is another technique that should be a part of every programmer’s toolkit. A class can provide a public static factory method, which is simply a static method that returns an instance of the class. Here’s a simple example from Boolean (the boxed primitive class for the primitive type boolean). This method translates a boolean primitive value into a Boolean object reference:
一个类允许客户端获得它本身的一个实例,通常的方式是提供一个公有的构造函数。还有另一种技术应该成为每个程序员工具箱中的一部分。一个类可以提供一个公有的静态工厂方法(static factory method),它只是一个返回类的实例的静态方法。这有一个来自Boolean(基本类型boolean的封装类)的简单例子。这个方法将一个boolean基本类型值转成Boolean对象的引用:
    public static Boolean valueOf(boolean b) {
        return b ? Boolean.TRUE : Boolean.FALSE;
    }
Note that a static factory method is not the same as the Factory Method pattern from Design Patterns [Gamma95, p. 107]. The static factory method described in this item has no direct equivalent in Design Patterns.
注意静态工厂方法与Design Patterns[Gamma95, p. 107]中的Factory Method模式是不同的。这个条目中描述的静态工厂方法在Design Patterns中没有直接对应的模式。
A class can provide its clients with static factory methods instead of, or in addition to, constructors. Providing a static factory method instead of a public constructor has both advantages and disadvantages.
一个类可以为它的客户提供静态工厂方法来代替构造函数,或者除了构造函数之外再提供一个静态工厂方法。提供静态工厂方法代替公有构造函数既有优点也有缺点。
One advantage of static factory methods is that, unlike constructors, they have names. If the parameters to a constructor do not, in and of themselves, describe the object being returned, a static factory with a well-chosen name is easier to use and the resulting client code easier to read. For example, the constructor BigInteger(int, int, Random), which returns a BigInteger that is probably prime, would have been better expressed as a static factory method named BigInteger.probablePrime. (This method was eventually added in the 1.4 release.)
与构造函数相比,静态工厂方法的第一个优势是它们有名字。如果构造函数的参数本身不能描述返回的对象,具有合适名字的静态工厂更容易使用,并且产生的客户端代码更易读。例如,构造函数BigInteger(int, int, Random)返回一个可能是素数的BigInteger,使用名字为BigInteger.probablePrime的静态工厂方法来表示会更好。(这个方法最终在1.4版本被引入。)
A class can have only a single constructor with a given signature. Programmers have been known to get around this restriction by providing two constructors whose parameter lists differ only in the order of their parameter types. This is a really bad idea. The user of such an API will never be able to remember which constructor is which and will end up calling the wrong one by mistake. People reading code that uses these constructors will not know what the code does without referring to the class documentation.
一个类只能有一个具有指定签名的构造函数。程序员知道怎样规避这个限制:通过提供两个构造函数,它们仅在参数列表类型的顺序上有所不同。这真的是一个坏主意。使用这种API的用户永远不能记住哪一个构造函数是哪一个,最后会无意中调用错误的构造函数。使用这些构造函数的人在读代码时如果没有类的参考文档将不知道代码要做什么。
Because they have names, static factory methods don’t share the restriction discussed in the previous paragraph. In cases where a class seems to require multiple constructors with the same signature, replace the constructors with static factory methods and carefully chosen names to highlight their differences.
因为静态工厂方法有名字,因此它们不会有上一段讨论的那种限制。当一个类似乎需要多个具有相同签名的构造函数时,用静态工厂方法代替构造函数,通过仔细选择工厂方法的名字来突出它们的不同。
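As a rough illustration of the paragraph above, here is a minimal sketch of two named factories that share one parameter list. The class Point and the method names fromCartesian and fromPolar are hypothetical, chosen only for this example; two constructors with the signature (double, double) could not coexist, but two named factories can.

```java
// Hypothetical immutable point class; the name and fields are illustrative only.
final class Point {
    private final double x;
    private final double y;

    // Private constructor: clients must go through the named factories.
    private Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Both factories take (double, double); the names carry the difference.
    static Point fromCartesian(double x, double y) {
        return new Point(x, y);
    }

    static Point fromPolar(double radius, double angleRadians) {
        return new Point(radius * Math.cos(angleRadians),
                         radius * Math.sin(angleRadians));
    }

    double x() { return x; }
    double y() { return y; }
}
```

A call such as Point.fromPolar(2.0, 0.0) tells the reader which interpretation of the two numbers is intended, without a trip to the class documentation.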
A second advantage of static factory methods is that, unlike constructors, they are not required to create a new object each time they’re invoked. This allows immutable classes (Item 15) to use preconstructed instances, or to cache instances as they’re constructed, and dispense them repeatedly to avoid creating unnecessary duplicate objects. The Boolean.valueOf(boolean) method illustrates this technique: it never creates an object. This technique is similar to the Flyweight pattern [Gamma95, p. 195]. It can greatly improve performance if equivalent objects are requested often, especially if they are expensive to create.
与构造函数相比,静态工厂方法的第二个优势是当调用静态工厂方法时不要求每次都创建一个新的对象。这允许不可变类(Item 15)使用预创建的实例,或缓存构建好的实例,通过重复分发它们避免创建不必要的重复对象。Boolean.valueOf(boolean)方法阐明了这个技术:它从不创建对象。这项技术与Flyweight模式类似[Gamma95, p. 195]。如果经常请求相同的对象,它能极大地提升性能,尤其是在创建对象的代价较昂贵时。
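The preconstructed-instance idea can be sketched in a few lines. The class Bit below is hypothetical and simply mirrors the shape of Boolean.valueOf: two instances are created once, and the factory only ever hands those out.

```java
// Hypothetical immutable class with exactly two preconstructed instances.
final class Bit {
    static final Bit ZERO = new Bit(0);
    static final Bit ONE = new Bit(1);

    private final int value;

    private Bit(int value) {
        this.value = value;
    }

    // Never allocates: every call returns one of the two shared instances.
    static Bit valueOf(boolean b) {
        return b ? ONE : ZERO;
    }

    int intValue() { return value; }
}
```

Because the constructor is private, no code outside the class can create a third instance.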
The ability of static factory methods to return the same object from repeated invocations allows classes to maintain strict control over what instances exist at any time. Classes that do this are said to be instance-controlled. There are several reasons to write instance-controlled classes. Instance control allows a class to guarantee that it is a singleton (Item 3) or noninstantiable (Item 4). Also, it allows an immutable class (Item 15) to make the guarantee that no two equal instances exist: a.equals(b) if and only if a==b. If a class makes this guarantee, then its clients can use the == operator instead of the equals(Object) method, which may result in improved performance. Enum types (Item 30) provide this guarantee.
静态工厂方法能从重复的调用中返回相同的对象,这使得类能严格控制在任何时候存在哪些实例。这样的类被称为实例受控的(instance-controlled)类。编写实例受控的类有几个原因。实例控制允许一个类保证它是一个单例(Item 3)或不可实例化的(Item 4)。它也允许一个不可变类(Item 15)保证不存在两个相等的实例:a.equals(b)当且仅当a==b。如果一个类保证了这一点,它的客户端可以使用==操作符代替equals(Object)方法,这可能会带来性能的提升。Enum类型(Item 30)保证了这一点。
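One possible way to obtain the guarantee a.equals(b) if and only if a==b is to cache instances as they are created. The sketch below assumes a hypothetical class CurrencyCode keyed by a string; the synchronized factory is one simple choice for thread safety, not the only one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical instance-controlled class: at most one instance per distinct code.
final class CurrencyCode {
    private static final Map<String, CurrencyCode> CACHE =
            new HashMap<String, CurrencyCode>();

    private final String code;

    private CurrencyCode(String code) {
        this.code = code;
    }

    // Returns the cached instance for this code, creating it on first request.
    // synchronized keeps the check-then-insert step safe across threads.
    static synchronized CurrencyCode of(String code) {
        CurrencyCode existing = CACHE.get(code);
        if (existing == null) {
            existing = new CurrencyCode(code);
            CACHE.put(code, existing);
        }
        return existing;
    }

    String code() { return code; }

    // equals is inherited from Object (identity comparison), which is correct
    // here because no two distinct instances ever carry the same code.
}
```

Clients of such a class can compare instances with == and get the same answer as equals.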
A third advantage of static factory methods is that, unlike constructors, they can return an object of any subtype of their return type. This gives you great flexibility in choosing the class of the returned object.
与构造函数相比,静态工厂方法的第三个优势是它们能返回它们的返回类型的任意子类型的对象。这样在选择返回对象的类时有了更大的灵活性。
One application of this flexibility is that an API can return objects without making their classes public. Hiding implementation classes in this fashion leads to a very compact API. This technique lends itself to interface-based frameworks (Item 18), where interfaces provide natural return types for static factory methods. Interfaces can’t have static methods, so by convention, static factory methods for an interface named Type are put in a noninstantiable class (Item 4) named Types.
灵活性的一个应用是API能返回对象而不必使它们的类变成公有的。通过这种方式隐藏实现类会得到一个非常简洁的API。这项技术适用于基于接口的框架(Item 18),接口为静态工厂方法提供了自然的返回类型。接口不能有静态方法,因此按惯例,命名为Type的接口的静态工厂方法被放在一个命名为Types的不可实例化的类中(Item 4)。
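The Type/Types convention can be sketched as follows. The interface Greeter and its companion class Greeters are hypothetical names invented for this example; the point is that the implementation classes never appear in the API.

```java
// Hypothetical service interface; only this type appears in the API.
interface Greeter {
    String greet(String name);
}

// Noninstantiable companion class holding the static factories for Greeter.
final class Greeters {
    private Greeters() {
        throw new AssertionError("no instances");
    }

    static Greeter polite() {
        return new PoliteGreeter();
    }

    static Greeter terse() {
        return new TerseGreeter();
    }

    // Implementation classes stay private; clients see only the interface.
    private static final class PoliteGreeter implements Greeter {
        public String greet(String name) {
            return "Good day, " + name + ".";
        }
    }

    private static final class TerseGreeter implements Greeter {
        public String greet(String name) {
            return "Hi " + name;
        }
    }
}
```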
For example, the Java Collections Framework has thirty-two convenience implementations of its collection interfaces, providing unmodifiable collections, synchronized collections, and the like. Nearly all of these implementations are exported via static factory methods in one noninstantiable class (java.util.Collections). The classes of the returned objects are all nonpublic.
例如,Java集合框架有三十二个集合接口的便利实现,提供了不可修改的集合、同步集合等等。几乎所有的这些实现都是通过一个不可实例化的类(java.util.Collections)中的静态工厂方法导出的。返回对象的类都是非公有的。
The Collections Framework API is much smaller than it would have been had it exported thirty-two separate public classes, one for each convenience implementation. It is not just the bulk of the API that is reduced, but the conceptual weight. The user knows that the returned object has precisely the API specified by its interface, so there is no need to read additional class documentation for the implementation classes. Furthermore, using such a static factory method requires the client to refer to the returned object by its interface rather than its implementation class, which is generally good practice (Item 52).
集合框架API比导出三十二个独立的公有类(每个便利实现对应一个类)时要小得多。减少的不仅仅是API的数量,还有概念上的负担。用户知道返回的对象恰好具有其接口所指定的API,因此不需要阅读额外的实现类文档。此外,使用这样的静态工厂方法要求客户端通过接口而不是实现类来引用返回的对象,这通常是良好的实践(Item 52)。
Not only can the class of an object returned by a public static factory method be nonpublic, but the class can vary from invocation to invocation depending on the values of the parameters to the static factory. Any class that is a subtype of the declared return type is permissible. The class of the returned object can also vary from release to release for enhanced software maintainability and performance.
不仅公有静态工厂方法返回对象的类可以是非公有的,而且这个类还可以随着调用静态工厂时输入的参数值的变化而变化。声明的返回值类型的任何子类都是可以的。为了增强软件的可维护性及性能,返回值对象的类也可以随着发布版本的变化而变化。
The class java.util.EnumSet (Item 32), introduced in release 1.5, has no public constructors, only static factories. They return one of two implementations, depending on the size of the underlying enum type: if it has sixty-four or fewer elements, as most enum types do, the static factories return a RegularEnumSet instance, which is backed by a single long; if the enum type has sixty-five or more elements, the factories return a JumboEnumSet instance, backed by a long array.
类java.util.EnumSet(Item 32)在1.5版本中引入,它没有公有的构造函数,只有静态工厂方法。根据底层枚举类型的大小,静态工厂方法返回两个实现中的一个:如果枚举类型中有六十四个或更少的元素(大多数枚举类型都是如此),静态工厂返回一个RegularEnumSet实例,由单个long支持;如果枚举类型中有六十五个或更多的元素,静态工厂方法返回一个JumboEnumSet实例,由long[]支持。
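The same selection idea can be sketched outside the JDK. The classes below (FlagSet, SmallFlagSet, LargeFlagSet) are hypothetical and only loosely modeled on the description above; this is not how EnumSet itself is written, and bounds checking is omitted for brevity.

```java
// Hypothetical set of small non-negative ints; callers see only FlagSet.
abstract class FlagSet {
    // Picks the implementation from the size of the universe of possible values.
    static FlagSet create(int universeSize) {
        if (universeSize <= 64) {
            return new SmallFlagSet();
        }
        return new LargeFlagSet(universeSize);
    }

    abstract void add(int flag);
    abstract boolean contains(int flag);
}

// Backed by a single long: one bit per possible value (0..63).
final class SmallFlagSet extends FlagSet {
    private long bits;

    void add(int flag) { bits |= 1L << flag; }
    boolean contains(int flag) { return (bits & (1L << flag)) != 0; }
}

// Backed by a long array for universes larger than 64 values.
final class LargeFlagSet extends FlagSet {
    private final long[] words;

    LargeFlagSet(int universeSize) {
        words = new long[(universeSize + 63) / 64];
    }

    void add(int flag) { words[flag / 64] |= 1L << (flag % 64); }
    boolean contains(int flag) {
        return (words[flag / 64] & (1L << (flag % 64))) != 0;
    }
}
```

Client code written against FlagSet keeps working if either implementation is later replaced or a third one is added.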
The existence of these two implementation classes is invisible to clients. If RegularEnumSet ceased to offer performance advantages for small enum types, it could be eliminated from a future release with no ill effects. Similarly, a future release could add a third or fourth implementation of EnumSet if it proved beneficial for performance. Clients neither know nor care about the class of the object they get back from the factory; they care only that it is some subclass of EnumSet.
这两个实现类的存在对于客户端是不可见的。如果RegularEnumSet对于较小的枚举类型不再具有性能优势,那么在将来的版本中将其移除不会有任何影响。同样地,如果证明对性能有益,将来的版本也可以添加EnumSet的第三或第四个实现。客户端不知道也不关心它们从工厂方法中得到的对象所属的类;它们只关心它是EnumSet的某个子类。
The class of the object returned by a static factory method need not even exist at the time the class containing the method is written. Such flexible static factory methods form the basis of service provider frameworks, such as the Java Database Connectivity API (JDBC). A service provider framework is a system in which multiple service providers implement a service, and the system makes the implementations available to its clients, decoupling them from the implementations.
在编写静态工厂方法所属的类时,静态工厂方法返回的对象所属的类可以不必存在。这种灵活的静态工厂方法形成了服务提供者框架的基础,例如Java数据库链接API(JDBC)。服务提供者框架是一个系统:多个服务提供者实现一个服务,系统为客户端提供服务的多个实现,使客户端与服务实现解耦。
There are three essential components of a service provider framework: a service interface, which providers implement; a provider registration API, which the system uses to register implementations, giving clients access to them; and a service access API, which clients use to obtain an instance of the service. The service access API typically allows but does not require the client to specify some criteria for choosing a provider. In the absence of such a specification, the API returns an instance of a default implementation. The service access API is the “flexible static factory” that forms the basis of the service provider framework.
服务提供者框架有三个基本的组件:服务接口,提供者实现;提供者注册API,系统用来注册实现,使客户端能访问它们;服务访问API,客户端用来得到服务实例。服务访问API通常允许但不要求客户端指定一些选择提供者的规则。在没有指定的情况下,API返回一个默认的实现实例。服务访问API是”灵活的静态工厂”,其形成了服务提供者框架的基础。
An optional fourth component of a service provider framework is a service provider interface, which providers implement to create instances of their service implementation. In the absence of a service provider interface, implementations are registered by class name and instantiated reflectively (Item 53). In the case of JDBC, Connection plays the part of the service interface, DriverManager.registerDriver is the provider registration API, DriverManager.getConnection is the service access API, and Driver is the service provider interface.
服务提供者框架的第四个可选组件是服务提供者接口,服务提供者通过实现这个接口来创建服务实现的实例。在没有服务提供者接口的情况下,服务实现通过类名进行注册,通过反射来进行实例化(Item 53)。在JDBC的案例中,Connection是服务接口,DriverManager.registerDriver是提供者注册API,DriverManager.getConnection是服务访问API,Driver是服务提供者接口。
There are numerous variants of the service provider framework pattern. For example, the service access API can return a richer service interface than the one required of the provider, using the Adapter pattern [Gamma95, p. 139]. Here is a simple implementation with a service provider interface and a default provider:
服务提供者框架模式有许多变种。例如,服务访问API通过使用适配器模式[Gamma95, p. 139],能返回比提供者需要的更丰富的服务接口。下面是一个带有服务提供者接口和默认提供者的简单实现:
    // Service provider framework sketch
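A minimal sketch of such a framework, with the three essential components plus a service provider interface and a default provider, might look like this. The names Service, Provider, Services, and the method name() are assumptions made for illustration, as is the choice of a ConcurrentHashMap for the registry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Service interface (the single method is a placeholder for real service methods)
interface Service {
    String name();
}

// Service provider interface
interface Provider {
    Service newService();
}

// Noninstantiable class for service registration and access
final class Services {
    private Services() { } // prevents instantiation

    // Maps provider names to providers
    private static final Map<String, Provider> providers =
            new ConcurrentHashMap<String, Provider>();
    static final String DEFAULT_PROVIDER_NAME = "<def>";

    // Provider registration API
    static void registerDefaultProvider(Provider p) {
        registerProvider(DEFAULT_PROVIDER_NAME, p);
    }

    static void registerProvider(String name, Provider p) {
        providers.put(name, p);
    }

    // Service access API: the flexible static factory
    static Service newInstance() {
        return newInstance(DEFAULT_PROVIDER_NAME);
    }

    static Service newInstance(String name) {
        Provider p = providers.get(name);
        if (p == null) {
            throw new IllegalArgumentException(
                    "No provider registered with name: " + name);
        }
        return p.newService();
    }
}
```

The class of the object returned by newInstance need not exist when Services is compiled; it only has to be registered before the first request.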
A fourth advantage of static factory methods is that they reduce the verbosity of creating parameterized type instances. Unfortunately, you must specify the type parameters when you invoke the constructor of a parameterized class even if they’re obvious from context. This typically requires you to provide the type parameters twice in quick succession:
静态工厂方法的第四个优势是它们降低了创建参数化类型实例的冗长性。遗憾的是,当你调用参数化类的构造函数时,你必须指定类型参数,即使它们在上下文中是非常明显的。这通常需要你紧接着提供两次类型参数:
    Map<String, List<String>> m = new HashMap<String, List<String>>();
This redundant specification quickly becomes painful as the length and complexity of the type parameters increase. With static factories, however, the compiler can figure out the type parameters for you. This is known as type inference. For example, suppose that HashMap provided this static factory:
随着类型参数长度和复杂性的增加,这个冗余的说明很快就让人变得很痛苦。但是使用静态工厂的话,编译器可以为你找出类型参数。这被称为类型推导。例如,假设HashMap提供了这个静态工厂:
    public static <K, V> HashMap<K, V> newInstance() {
        return new HashMap<K, V>();
    }
Then you could replace the wordy declaration above with this succinct alternative:
你可以将上面冗长的声明用下面简洁的形式去替换:
    Map<String, List<String>> m = HashMap.newInstance();
Someday the language may perform this sort of type inference on constructor invocations as well as method invocations, but as of release 1.6, it does not.
某一天,Java语言可能在构造函数调用上也有与方法调用类似的类型推导,但到发行版本1.6为止,它一直没有。
Unfortunately, the standard collection implementations such as HashMap do not have factory methods as of release 1.6, but you can put these methods in your own utility class. More importantly, you can provide such static factories in your own parameterized classes.
遗憾的是,到发行版本1.6为止,标准集合实现例如HashMap没有工厂方法,但你可以把这些方法放到你自己的工具类里。更重要的是,你可以在你自己的参数化类里提供这样的静态工厂。
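A utility class of the kind suggested above could be sketched like this. The class name Containers and its method names are hypothetical; the JDK does not define them.

```java
import java.util.ArrayList;
import java.util.HashMap;

// Hypothetical utility class holding generic static factories.
final class Containers {
    private Containers() { } // noninstantiable

    // The compiler infers K and V from the type of the assignment target.
    static <K, V> HashMap<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    // Same idea for a single type parameter.
    static <E> ArrayList<E> newArrayList() {
        return new ArrayList<E>();
    }
}
```

With this in place, a declaration such as Map<String, List<String>> m = Containers.newHashMap(); states the type parameters only once.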
The main disadvantage of providing only static factory methods is that classes without public or protected constructors cannot be subclassed. The same is true for nonpublic classes returned by public static factories. For example, it is impossible to subclass any of the convenience implementation classes in the Collections Framework. Arguably this can be a blessing in disguise, as it encourages programmers to use composition instead of inheritance (Item 16).
只提供静态工厂方法的缺点是没有公有或保护构造函数的类不能进行子类化。公有静态工厂返回的非公有类同样如此。例如,不可能子类化集合框架中的这些便利实现类。可以说这是因祸得福,因为它鼓励程序员使用组合来代替继承(Item 16)。
A second disadvantage of static factory methods is that they are not readily distinguishable from other static methods. They do not stand out in API documentation in the way that constructors do, so it can be difficult to figure out how to instantiate a class that provides static factory methods instead of constructors. The Javadoc tool may someday draw attention to static factory methods. In the meantime, you can reduce this disadvantage by drawing attention to static factories in class or interface comments, and by adhering to common naming conventions. Here are some common names for static factory methods:
- valueOf: Returns an instance that has, loosely speaking, the same value as its parameters. Such static factories are effectively type-conversion methods.
- of: A concise alternative to valueOf, popularized by EnumSet (Item 32).
- getInstance: Returns an instance that is described by the parameters but cannot be said to have the same value. In the case of a singleton, getInstance takes no parameters and returns the sole instance.
- newInstance: Like getInstance, except that newInstance guarantees that each instance returned is distinct from all others.
- getType: Like getInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
- newType: Like newInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method.
静态工厂方法的第二个缺点是它们不能很容易的与其它静态方法进行区分。它们不能像构造函数那样在API文档中明确标识出来,因此很难弄明白怎样实例化一个提供静态工厂方法代替构造函数的类。Javadoc工具可能某一天会关注静态工厂方法。同时,你可以通过在类中或接口注释中注意静态工厂和遵循通用命名约定来减少这个劣势。下面是静态工厂方法的一些常用命名:
- valueOf:不严格地说,返回一个与它的参数具有相同值的实例。这种静态工厂实际上是类型转换方法。
- of:valueOf的一种简洁替代,通过EnumSet(Item 32)得到普及。
- getInstance:返回一个由参数描述的实例,但不能说它与参数具有相同的值。在单例情况下,getInstance没有参数并且返回唯一的实例。
- newInstance:类似于getInstance,但newInstance保证每个返回的实例都与其它所有实例不同。
- getType:类似于getInstance,但在工厂方法位于不同的类中时使用。Type表示工厂方法返回的对象类型。
- newType:类似于newInstance,但在工厂方法位于不同的类中时使用。Type表示工厂方法返回的对象类型。
In summary, static factory methods and public constructors both have their uses, and it pays to understand their relative merits. Often static factories are preferable, so avoid the reflex to provide public constructors without first considering static factories.
总之,静态工厂方法和公有构造函数都有它们的作用,理解它们的相对优势是值得的。静态工厂经常是更合适的,因此要避免习惯性的提供公有构造函数而不首先考虑静态工厂。
Item 2: Consider a builder when faced with many constructor parameters
Item 2:当面临很多构造函数参数时,要考虑使用构建器
Static factories and constructors share a limitation: they do not scale well to large numbers of optional parameters. Consider the case of a class representing the Nutrition Facts label that appears on packaged foods. These labels have a few required fields (serving size, servings per container, and calories per serving) and over twenty optional fields (total fat, saturated fat, trans fat, cholesterol, sodium, and so on). Most products have nonzero values for only a few of these optional fields.
静态工厂和构造函数有一个共同的限制:对于大量可选参数它们都不能很好的扩展。考虑这样一种情况:用一个类来表示包装食品上的营养成分标签。这些标签有几个字段是必须的——每份含量、每罐含量(份数)、每份的卡路里,二十个以上的可选字段——总脂肪量、饱和脂肪量、转化脂肪、胆固醇、钠等等。大多数产品中这些可选字段中的仅有几个是非零值。
What sort of constructors or static factories should you write for such a class? Traditionally, programmers have used the telescoping constructor pattern, in which you provide a constructor with only the required parameters, another with a single optional parameter, a third with two optional parameters, and so on, culminating in a constructor with all the optional parameters. Here’s how it looks in practice. For brevity’s sake, only four optional fields are shown:
你应该为这样的一个类写什么样的构造函数或静态工厂?习惯上,程序员使用重叠构造函数模式,在这种模式中只给第一个构造函数提供必要的参数,给第二个构造函数提供一个可选参数,给第三个构造函数提供两个可选参数,以此类推,最后的构造函数具有所有的可选参数。下面是一个实践中的例子。为了简便,只显示了四个可选字段:
    // Telescoping constructor pattern - does not scale well!
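A minimal sketch of what such a class might look like, assuming two required fields (servingSize, servings) and the four optional fields calories, fat, sodium, and carbohydrate. The field names follow the client code in this item; the accessors are added only so the sketch can be exercised.

```java
// Telescoping constructor pattern - does not scale well!
class NutritionFacts {
    private final int servingSize;  // required
    private final int servings;     // required
    private final int calories;     // optional
    private final int fat;          // optional
    private final int sodium;       // optional
    private final int carbohydrate; // optional

    NutritionFacts(int servingSize, int servings) {
        this(servingSize, servings, 0);
    }

    NutritionFacts(int servingSize, int servings, int calories) {
        this(servingSize, servings, calories, 0);
    }

    NutritionFacts(int servingSize, int servings, int calories, int fat) {
        this(servingSize, servings, calories, fat, 0);
    }

    NutritionFacts(int servingSize, int servings, int calories, int fat,
                   int sodium) {
        this(servingSize, servings, calories, fat, sodium, 0);
    }

    // Every shorter constructor eventually delegates to this one.
    NutritionFacts(int servingSize, int servings, int calories, int fat,
                   int sodium, int carbohydrate) {
        this.servingSize = servingSize;
        this.servings = servings;
        this.calories = calories;
        this.fat = fat;
        this.sodium = sodium;
        this.carbohydrate = carbohydrate;
    }

    // Accessors added for illustration only.
    int fat() { return fat; }
    int sodium() { return sodium; }
    int carbohydrate() { return carbohydrate; }
}
```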
When you want to create an instance, you use the constructor with the shortest parameter list containing all the parameters you want to set:
当你想创建一个实例时,你可以使用具有最短参数列表的构造函数,最短参数列表包含了所有你想设置的参数:
    NutritionFacts cocaCola = new NutritionFacts(240, 8, 100, 0, 35, 27);
Typically this constructor invocation will require many parameters that you don’t want to set, but you’re forced to pass a value for them anyway. In this case, we passed a value of 0 for fat. With “only” six parameters this may not seem so bad, but it quickly gets out of hand as the number of parameters increases.
通常这个构造函数调用需要许多你不想设置的参数,但无论如何你不得不为它们传值。在这种情况下,我们给fat传了一个零值。“只有”六个参数可能还不是那么糟糕,但随着参数数目的增长它很快就会失控。
In short, the telescoping constructor pattern works, but it is hard to write client code when there are many parameters, and harder still to read it. The reader is left wondering what all those values mean and must carefully count parameters to find out. Long sequences of identically typed parameters can cause subtle bugs. If the client accidentally reverses two such parameters, the compiler won’t complain, but the program will misbehave at runtime (Item 40).
简而言之,重叠构造函数模式有作用,但是当有许多参数时很难编写客户端代码,更难的是阅读代码。读者会很奇怪所有的这些值是什么意思,必须仔细的计算参数个数才能查明。一长串同类型的参数会引起细微的错误。如果客户端偶然的颠倒了两个这样的参数,编译器不会报错,但程序在运行时会出现错误的行为(Item 40)。
A second alternative when you are faced with many constructor parameters is the JavaBeans pattern, in which you call a parameterless constructor to create the object and then call setter methods to set each required parameter and each optional parameter of interest:
当你面临许多构造函数参数时,第二个替代选择是JavaBeans模式,在这种模式中你要调用无参构造函数来创建对象,然后调用setter方法为每一个必要参数和每一个感兴趣的可选参数设置值:
    // JavaBeans Pattern - allows inconsistency, mandates mutability
This pattern has none of the disadvantages of the telescoping constructor pattern. It is easy, if a bit wordy, to create instances, and easy to read the resulting code:
这个模式没有重叠构造函数模式的缺点。即使有点啰嗦,但它很容易创建实例,也很容易阅读写出来的代码:
    NutritionFacts cocaCola = new NutritionFacts();
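For comparison, the JavaBeans variant and its client code can be sketched as follows. The field names, the default of -1 for required values, the getter, and the helper class JavaBeansClient are all assumptions made for this example.

```java
// JavaBeans pattern - allows inconsistency, mandates mutability
class NutritionFacts {
    // Parameters initialized to default values (if any)
    private int servingSize = -1; // required; no default value
    private int servings = -1;    // required; no default value
    private int calories = 0;
    private int fat = 0;
    private int sodium = 0;
    private int carbohydrate = 0;

    NutritionFacts() { }

    // Setters: note that none of the fields can be final
    void setServingSize(int val) { servingSize = val; }
    void setServings(int val) { servings = val; }
    void setCalories(int val) { calories = val; }
    void setFat(int val) { fat = val; }
    void setSodium(int val) { sodium = val; }
    void setCarbohydrate(int val) { carbohydrate = val; }

    // Getters added for illustration only
    int getServingSize() { return servingSize; }
    int getSodium() { return sodium; }
}

// Hypothetical client: the object is visible, half-built, between the calls.
final class JavaBeansClient {
    static NutritionFacts cocaCola() {
        NutritionFacts cocaCola = new NutritionFacts();
        cocaCola.setServingSize(240);
        cocaCola.setServings(8);
        cocaCola.setCalories(100);
        cocaCola.setSodium(35);
        cocaCola.setCarbohydrate(27);
        return cocaCola;
    }
}
```

Between new NutritionFacts() and the last setter call, the required fields still hold their placeholder values, which is the inconsistency window the surrounding text describes.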
Unfortunately, the JavaBeans pattern has serious disadvantages of its own. Because construction is split across multiple calls, a JavaBean may be in an inconsistent state partway through its construction. The class does not have the option of enforcing consistency merely by checking the validity of the constructor parameters. Attempting to use an object when it’s in an inconsistent state may cause failures that are far removed from the code containing the bug, hence difficult to debug. A related disadvantage is that the JavaBeans pattern precludes the possibility of making a class immutable (Item 15), and requires added effort on the part of the programmer to ensure thread safety.
遗憾的是,JavaBeans模式自身有着严重缺点。因为构造过程被分到了多次调用中,JavaBean在构造过程中可能会处于不一致的状态。类无法仅仅通过检查构造函数参数的有效性来保证一致性。试图使用处于不一致状态的对象可能会引起失败,而失败的位置与包含错误的代码相距甚远,因此很难调试。与此相关的一个缺点是JavaBeans模式排除了使一个类不可变的可能性(Item 15),并且需要程序员付出额外的努力来确保线程安全。
It is possible to reduce these disadvantages by manually “freezing” the object when its construction is complete and not allowing it to be used until frozen, but this variant is unwieldy and rarely used in practice. Moreover, it can cause errors at runtime, as the compiler cannot ensure that the programmer calls the freeze method on an object before using it.
当构造工作完成时,可以通过手动『冰冻』对象并且在冰冻完成之前不允许使用它来弥补这个缺点,但这种方式太笨重了,在实践中很少使用。而且,由于编译器不能保证程序员在使用对象之前调用了冰冻方法,因此它可能在运行时引起错误。
Luckily, there is a third alternative that combines the safety of the telescoping constructor pattern with the readability of the JavaBeans pattern. It is a form of the Builder pattern [Gamma95, p. 97]. Instead of making the desired object directly, the client calls a constructor (or static factory) with all of the required parameters and gets a builder object. Then the client calls setter-like methods on the builder object to set each optional parameter of interest. Finally, the client calls a parameterless build method to generate the object, which is immutable. The builder is a static member class (Item 22) of the class it builds. Here’s how it looks in practice:
幸运的是,还有第三种替代方法,它结合了重叠构造函数模式的安全性和JavaBeans模式的可读性。它是构建器模式[Gamma95, p. 97]的一种形式。客户端不直接构建需要的对象,而是调用具有所有必需参数的构造函数(或静态工厂),得到一个构建器对象。然后客户端在构建器上调用类似于setter的方法来设置每个感兴趣的可选参数。最终,客户端调用无参的build方法来产生一个对象,这个对象是不可变的。构建器是它要构建的类的静态成员类(Item 22)。它在实践中的形式如下:
    // Builder Pattern
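A minimal sketch of the builder variant, under the same assumptions about field names as the client code in this item (calories, fat, sodium, carbohydrate). The accessors on NutritionFacts are added only for illustration.

```java
// Builder pattern
class NutritionFacts {
    private final int servingSize;
    private final int servings;
    private final int calories;
    private final int fat;
    private final int sodium;
    private final int carbohydrate;

    static class Builder {
        // Required parameters
        private final int servingSize;
        private final int servings;

        // Optional parameters - initialized to default values
        private int calories = 0;
        private int fat = 0;
        private int sodium = 0;
        private int carbohydrate = 0;

        Builder(int servingSize, int servings) {
            this.servingSize = servingSize;
            this.servings = servings;
        }

        // Each setter-like method returns the builder so calls can be chained.
        Builder calories(int val) { calories = val; return this; }
        Builder fat(int val) { fat = val; return this; }
        Builder sodium(int val) { sodium = val; return this; }
        Builder carbohydrate(int val) { carbohydrate = val; return this; }

        NutritionFacts build() {
            return new NutritionFacts(this);
        }
    }

    // Private constructor copies every value out of the builder.
    private NutritionFacts(Builder builder) {
        servingSize = builder.servingSize;
        servings = builder.servings;
        calories = builder.calories;
        fat = builder.fat;
        sodium = builder.sodium;
        carbohydrate = builder.carbohydrate;
    }

    // Accessors added for illustration only.
    int calories() { return calories; }
    int fat() { return fat; }
    int sodium() { return sodium; }
}
```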
Note that NutritionFacts is immutable, and that all parameter default values are in a single location. The builder’s setter methods return the builder itself so that invocations can be chained. Here’s how the client code looks:
注意NutritionFacts是不可变的,所有参数的默认值都在一个单独的位置。构建器的setter方法返回的是构建器本身,为的是可以链式调用。客户端代码如下:
    NutritionFacts cocaCola = new NutritionFacts.Builder(240, 8).calories(100).sodium(35).carbohydrate(27).build();
This client code is easy to write and, more importantly, to read. The Builder pattern simulates named optional parameters as found in Ada and Python.
Like a constructor, a builder can impose invariants on its parameters. The build method can check these invariants. It is critical that they be checked after copying the parameters from the builder to the object, and that they be checked on the object fields rather than the builder fields (Item 39). If any invariants are violated, the build method should throw an IllegalStateException (Item 60). The exception’s detail message should indicate which invariant is violated (Item 63).
客户端代码很容器写,更重要的是很容易读。构建器模式模拟了命名可选参数,就像Ada和Python中的一样。类似于构造函数,构造器可以对它参数加上约束条件。构造器方法可以检查这些约束条件。将参数从构建器拷贝到对象中之后,可以在对象作用域而不是构造器作用域对约束条件进行检查,这是很关键的(Item 39)。如果违反了任何约束条件,构造器方法会抛出IllegalStateException
异常(Item 60)。异常的详细信息会指出违反了哪一个约束条件(Item 63)。
Another way to impose invariants involving multiple parameters is to have setter methods take entire groups of parameters on which some invariant must hold. If the invariant isn’t satisfied, the setter method throws an IllegalArgumentException. This has the advantage of detecting the invariant failure as soon as the invalid parameters are passed, instead of waiting for build to be invoked.
给许多参数加上约束条件的另一种方式是对某些约束条件必须持有的整组参数用setter方法进行检查,如果没有满足约束条件,setter方法会抛出IllegalArgumentException
异常。这个优点在于是一旦传递了无效参数,检测约束条件会失败,而不是等待build
被调用。
A minor advantage of builders over constructors is that builders can have multiple varargs parameters. Constructors, like methods, can have only one varargs parameter. Because builders use separate methods to set each parameter, they can have as many varargs parameters as you like, up to one per setter method.
相比于构造函数,构建器的一个小优势在与构建器可以有许多可变参数。构造函数类似于方法,只能有一个可变参数。由于构造器用单独的方法设置每一个参数,因此像你喜欢的那样,它们能有许多可变参数,直到每个setter方法都有一个可变参数。
The Builder pattern is flexible. A single builder can be used to build multiple objects. The parameters of the builder can be tweaked between object creations to vary the objects. The builder can fill in some fields automatically, such as a serial number that automatically increases each time an object is created.
构建器模式是灵活的。一个构建器可以用来构建多个对象。为了改变对象,构建器参数在创建对象时可以进行改变。构建器能自动填充一些字段,例如每次创建对象时序号自动增加。
A builder whose parameters have been set makes a fine Abstract Factory [Gamma95, p. 87]. In other words, a client can pass such a builder to a method to enable the method to create one or more objects for the client. To enable this usage, you need a type to represent the builder. If you are using release 1.5 or a later release, a single generic type (Item 26) suffices for all builders, no matter what type of object they’re building:
设置了参数的构建器形成了一个很好的抽象工厂[Gamma95,p.87]。换句话说,为了使某个方法能为客户端创建一个或多个对象,客户端可以传递这样的一个构建器到这个方法中。为了使这个用法可用,你需要用一个类型来表示构建器。如果你在使用JDK 1.5或之后的版本,只要一个泛型就能满足所有的构建器(Item 26),无论正在构建的是什么类型:
// A builder for objects of type T
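A minimal sketch of such a generic builder type. The `Point` and `PointBuilder` classes are hypothetical and exist only to show one implementation of the interface.

```java
// A builder for objects of type T
interface Builder<T> {
    T build();
}

// Hypothetical immutable product type, for illustration only
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical builder that implements the generic interface
final class PointBuilder implements Builder<Point> {
    private int x = 0;
    private int y = 0;

    PointBuilder x(int val) { x = val; return this; }
    PointBuilder y(int val) { y = val; return this; }

    @Override
    public Point build() {
        return new Point(x, y);
    }
}
```

Code that only needs to create objects can depend on `Builder<Point>` and never see the concrete builder class.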
Note that our NutritionFacts.Builder class could be declared to implement Builder<NutritionFacts>.
注意我们可以声明NutritionFacts.Builder
类来实现Builder<NutritionFacts>
。
Methods that take a Builder instance would typically constrain the builder’s type parameter using a bounded wildcard type (Item 28). For example, here is a method that builds a tree using a client-provided Builder instance to build each node:
带有构建器实例的方法通常使用绑定的通配符类型来约束构建器的类型参数(Item 28)。例如,构建树的方法通过使用客户端提供的构建器实例来构建每一个结点:
Tree buildTree(Builder<? extends Node> nodeBuilder) { ... }
The traditional Abstract Factory implementation in Java has been the Class object, with the newInstance method playing the part of the build method. This usage is fraught with problems. The newInstance method always attempts to invoke the class’s parameterless constructor, which may not even exist. You don’t get a compile-time error if the class has no accessible parameterless constructor. Instead, the client code must cope with InstantiationException or IllegalAccessException at runtime, which is ugly and inconvenient. Also, the newInstance method propagates any exceptions thrown by the parameterless constructor, even though newInstance lacks the corresponding throws clauses. In other words, Class.newInstance breaks compile-time exception checking. The Builder interface, shown above, corrects these deficiencies.
Java中传统的抽象工厂实现是类对象,newInstance
方法扮演着build
方法的角色。 这种用法问题重重。newInstance
方法总是尝试调用类的无参构造函数,但无参构造函数可能并不存在。如果类没有访问无参构造函数,你不会收到编译时错误。而客户端代码必须处理运行时的InstantiationException
或IllegalAccessException
异常,这样既不雅观也不方便。newInstance
也会传播无参构造函数抛出的任何异常,即使newInstance
缺少对应的抛出语句块。换句话说,Class.newInstance
打破了编译时的异常检测。上面的Builder
接口弥补了这些缺陷。
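The point about undeclared exceptions can be illustrated with a small sketch. The `Fragile` and `NewInstanceDemo` classes are hypothetical. Note also that Class.newInstance has been deprecated since Java 9, which is why the warning is suppressed here.

```java
import java.io.IOException;

// Hypothetical class whose parameterless constructor throws a checked exception
class Fragile {
    Fragile() throws IOException {
        throw new IOException("constructor failed");
    }
}

class NewInstanceDemo {
    // No "throws IOException" clause here, yet an IOException can escape,
    // because newInstance propagates whatever the constructor throws.
    @SuppressWarnings("deprecation")
    static Object create(Class<?> type)
            throws InstantiationException, IllegalAccessException {
        return type.newInstance();
    }

    // Returns the exception that escaped from create, or null if none did
    static Throwable escaped() {
        try {
            create(Fragile.class);
            return null;
        } catch (Throwable t) {
            return t;
        }
    }
}
```

The compiler accepts `create` without complaint, so a caller has no compile-time hint that it must handle an IOException.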
The Builder pattern does have disadvantages of its own. In order to create an object, you must first create its builder. While the cost of creating the builder is unlikely to be noticeable in practice, it could be a problem in some performance-critical situations. Also, the Builder pattern is more verbose than the telescoping constructor pattern, so it should be used only if there are enough parameters, say, four or more. But keep in mind that you may want to add parameters in the future. If you start out with constructors or static factories, and add a builder when the class evolves to the point where the number of parameters starts to get out of hand, the obsolete constructors or static factories will stick out like a sore thumb. Therefore, it’s often better to start with a builder in the first place.
构建器模式也有它的缺点。为了创建对象,你必须首先创建它的构建器。虽然创建构建器的代价在实践中可能不是那么明显,但在某些性能优先关键的情况下它可能是一个问题。构建器模式比重叠构造函数模式更啰嗦,因此只有在参数足够多的情况下才去使用它,比如四个或更多。但要记住将来你可能会增加参数。如果你开始使用构造函数或静态工厂,当类发展到参数数目开始失控的情况下,才增加一个构建器,废弃的构造函数或静态工厂就像一个疼痛的拇指,最好是在开始就使用构建器。
In summary, the Builder pattern is a good choice when designing classes whose constructors or static factories would have more than a handful of parameters, especially if most of those parameters are optional. Client code is much easier to read and write with builders than with the traditional telescoping constructor pattern, and builders are much safer than JavaBeans.
总之,当设计的类的构造函数或静态工厂有许多参数时,构建器模式是一个很好的选择,尤其是大多数参数是可选参数的情况下。与传统的重叠构造函数模式相比,使用构建器模式的客户端代码更易读易编写,与JavaBeans模式相比使用构建器模式更安全。
Item 3: Enforce the singleton property with a private constructor or an enum type
A singleton is simply a class that is instantiated exactly once [Gamma95, p. 127]. Singletons typically represent a system component that is intrinsically unique, such as the window manager or file system. Making a class a singleton can make it difficult to test its clients, as it’s impossible to substitute a mock implementation for a singleton unless it implements an interface that serves as its type.
单例简单来说就是一个类只被实例化一次[Gamma95, p. 127]。通常单例表示一个系统组件在本质上来说是唯一的,例如窗口管理或文件系统。一个类成为单例会使它的客户端测试变得很困难,因为不可能用伪实现来代替单例,除非它实现了一个接口,这个接口作为它的服务类型。
Before release 1.5, there were two ways to implement singletons. Both are based on keeping the constructor private and exporting a public static member to provide access to the sole instance. In one approach, the member is a final field:
在1.5版本之前,有两种方式来实现单例。它们都是通过保持私有构造函数并输出一个公有静态成员来提供对类唯一实例的访问来实现的。在第一种方法中,公有静态成员被声明为final变量:
// Singleton with public final field
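A minimal sketch of this first approach. The `leaveTheBuilding` method is a placeholder standing in for whatever the singleton actually does.

```java
// Singleton with public final field
class Elvis {
    public static final Elvis INSTANCE = new Elvis();

    // Private constructor: no code outside this class can create an Elvis
    private Elvis() { }

    // Placeholder instance method (assumed, for illustration)
    public void leaveTheBuilding() { }
}
```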
The private constructor is called only once, to initialize the public static final field Elvis.INSTANCE. The lack of a public or protected constructor guarantees a “monoelvistic” universe: exactly one Elvis instance will exist once the Elvis class is initialized: no more, no less. Nothing that a client does can change this, with one caveat: a privileged client can invoke the private constructor reflectively (Item 53) with the aid of the AccessibleObject.setAccessible method. If you need to defend against this attack, modify the constructor to make it throw an exception if it’s asked to create a second instance.
为了初始化公有静态final变量Elvis.INSTANCE
,私有构造函数只调用一次。公有或保护构造函数的缺失保证了全局唯一性:确切的说一旦Elvis
类初始化,将只有一个Elvis
实例存在——不会多也不会少。客户端不能改变这个情况,但要提醒一点:有特权的客户端可以借用AccessibleObject.setAccessible
方法方法,通过反射机制(Item 53)的调用私有构造函数。如果你需要抵御这种攻击,修改构造函数使它在创建第二个实例时抛出异常。
In the second approach to implementing singletons, the public member is a static factory method:
在第二种实现单例的方法中,公有成员是一个静态工厂方法:
// Singleton with static factory
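A minimal sketch of the static factory variant. As before, `leaveTheBuilding` is a placeholder method.

```java
// Singleton with static factory
class Elvis {
    private static final Elvis INSTANCE = new Elvis();

    private Elvis() { }

    // The factory could later return something other than a single
    // shared instance without changing this signature.
    public static Elvis getInstance() {
        return INSTANCE;
    }

    // Placeholder instance method (assumed, for illustration)
    public void leaveTheBuilding() { }
}
```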
All calls to Elvis.getInstance return the same object reference, and no other Elvis instance will ever be created (with the same caveat mentioned above).
所有Elvis.getInstance
方法的调用都会返回同一个对象实例,并且不会有其它的Elvis
实例被创建(提醒同上)。
The main advantage of the public field approach is that the declarations make it clear that the class is a singleton: the public static field is final, so it will always contain the same object reference. There is no longer any performance advantage to the public field approach: modern Java virtual machine (JVM) implementations are almost certain to inline the call to the static factory method.
公有变量方法的主要优势在于更清晰的声明这个类是一个单例类:公有静态变量是final的,因此它总是包含同一个对象的引用。公有变量方法没有任何性能优势:现代Java虚拟机(JVM)的大多数实现都是将静态工厂方法当做内联函数来调用。
One advantage of the factory-method approach is that it gives you the flexibility to change your mind about whether the class should be a singleton without changing its API. The factory method returns the sole instance but could easily be modified to return, say, a unique instance for each thread that invokes it. A second advantage, concerning generic types, is discussed in Item 27. Often neither of these advantages is relevant, and the final-field approach is simpler.
工厂方法的一个优势在于你可以灵活的改变你的想法,无论类是否是单例你都不必修改它的API。工厂方法返回唯一的实例,但它很容易被修改成为每个调用它的线程都返回一个唯一的实例。第二个优势是关于泛型的,在Item 27讨论。这些优势往往都是相关的,final变量方法更简单。
To make a singleton class that is implemented using either of the previous approaches serializable (Chapter 11), it is not sufficient merely to add implements Serializable to its declaration. To maintain the singleton guarantee, you have to declare all instance fields transient and provide a readResolve method (Item 77). Otherwise, each time a serialized instance is deserialized, a new instance will be created, leading, in the case of our example, to spurious Elvis sightings. To prevent this, add this readResolve method to the Elvis class:
为了使上面方法实现的单例类可序列化(第11章),仅仅在它的声明中实现Serializable
接口是不够的。为了保证单例性,你必须将所有的实例变量声明为transient
并提供一个readResolve
方法(Item 77)。否则,每次一个序列化的实例在反序列化时将会创建一个新的实例,在我们的例子中,会看到一个假的Elvis
。为了防止这种情况发生,要在Elvis
类中添加readResolve
方法:
// readResolve method to preserve singleton property
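A minimal sketch of a serializable singleton with readResolve. The `SerializationRoundTrip` helper is hypothetical and only serves to show that deserialization hands back the existing instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Elvis implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final Elvis INSTANCE = new Elvis();

    private Elvis() { }

    // readResolve method to preserve singleton property
    private Object readResolve() {
        // Return the one true Elvis; the freshly deserialized copy
        // becomes unreachable and is garbage collected.
        return INSTANCE;
    }
}

// Hypothetical helper: serializes an object to bytes and reads it back
class SerializationRoundTrip {
    static Object copy(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without the readResolve method, the same round trip would return a second, distinct Elvis object.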
As of release 1.5, there is a third approach to implementing singletons. Simply make an enum type with one element:
在1.5版本中,有第三种实现单例的方法。简单声明一个只有一个元素的枚举类型:
// Enum singleton - the preferred approach
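A minimal sketch of the single-element enum form, again with a placeholder `leaveTheBuilding` method.

```java
// Enum singleton - the preferred approach
enum Elvis {
    INSTANCE;

    // Placeholder instance method (assumed, for illustration)
    public void leaveTheBuilding() { }
}
```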
This approach is functionally equivalent to the public field approach, except that it is more concise, provides the serialization machinery for free, and provides an ironclad guarantee against multiple instantiation, even in the face of sophisticated serialization or reflection attacks. While this approach has yet to be widely adopted, a single-element enum type is the best way to implement a singleton.
这个方法除了它更简洁之外,它在功能上等价于公有变量方法,免费提供了序列化机制,并且强有力的保证了不会被多次实例化,即使是在面临复杂的序列化或反射攻击时。虽然这个方法仍没有被广泛采用,但单元素的枚举类型是实现单例的最好方式。
Item 4: Enforce noninstantiability with a private constructor
Occasionally you’ll want to write a class that is just a grouping of static methods and static fields. Such classes have acquired a bad reputation because some people abuse them to avoid thinking in terms of objects, but they do have valid uses. They can be used to group related methods on primitive values or arrays, in the manner of java.lang.Math or java.util.Arrays. They can also be used to group static methods, including factory methods (Item 1), for objects that implement a particular interface, in the manner of java.util.Collections. Lastly, they can be used to group methods on a final class, instead of extending the class.
有时你会想写一个只包含一组静态方法和静态变量的类。这种类的名声很不好,因为有些人滥用它们来避免思考如何面向对象,但它们确实是有用的。它们可以用来以java.lang.Math
或java.util.Arrays
的方式来组织与基本类型或数组相关的方法。它们也可以用来以java.util.Collections
的方式来组织实现特定接口对象的静态方法,包括工厂方法(Item 1)。最后,它们可以用来组织一个fianl类的方法,从而代替扩展这个类。
Such utility classes were not designed to be instantiated: an instance would be nonsensical. In the absence of explicit constructors, however, the compiler provides a public, parameterless default constructor. To a user, this constructor is indistinguishable from any other. It is not uncommon to see unintentionally instantiable classes in published APIs.
这种工具类被设计成不能实例化:它的实例是没有意义的。然而,在缺少显式构造函数的情况下,编译器会提供一个公有的无参构造默认函数。对用户而言,这个构造函数与其它的构造函数没有任何差别。在发布的APIs中看到无意义的可实例化类是很罕见的。
Attempting to enforce noninstantiability by making a class abstract does not work. The class can be subclassed and the subclass instantiated. Furthermore, it misleads the user into thinking the class was designed for inheritance (Item 17). There is, however, a simple idiom to ensure noninstantiability. A default constructor is generated only if a class contains no explicit constructors, so a class can be made noninstantiable by including a private constructor:
企图通过声明一个类为抽象类来强制类不能被实例化是行不通的。这个类可以被子类化,子类可以被实例化。而且,它会使用户误认为这个类是为继承而设计的(Item 17)。然而有一些简单的习惯用法可以确保类不能被实例化。如果一个类没有显式的构造函数,会产生默认的构造函数,因此,一个含有私有构造函数的类不能被实例化:
// Noninstantiable utility class
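A minimal sketch of the idiom. The static method `twice` is a hypothetical stand-in for the real utility methods.

```java
// Noninstantiable utility class
class UtilityClass {
    // Suppress default constructor for noninstantiability
    private UtilityClass() {
        throw new AssertionError();
    }

    // Hypothetical static helper, for illustration only
    static int twice(int x) {
        return 2 * x;
    }
}
```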
Because the explicit constructor is private, it is inaccessible outside of the class. The AssertionError isn’t strictly required, but it provides insurance in case the constructor is accidentally invoked from within the class. It guarantees that the class will never be instantiated under any circumstances. This idiom is mildly counterintuitive, as the constructor is provided expressly so that it cannot be invoked. It is therefore wise to include a comment, as shown above.
因为显式构造函数是私有的,因此类的外部不能访问构造函数。AssertionError
不是必须的,但它可以避免类内部无意的调用构造函数。这种习惯用法有点违背直觉,似乎构造函数的提供就是为了它不能被调用一样。因此明智的做法是在类中加上注释,像上面的例子一样。
As a side effect, this idiom also prevents the class from being subclassed. All constructors must invoke a superclass constructor, explicitly or implicitly, and a subclass would have no accessible superclass constructor to invoke.
这种习惯用法的一个副作用就是阻止了类的子类化。子类的所有的构造函数必须调用父类的构造函数,无论是显式的或隐式的,但这种情况下子类不能调用父类构造函数。
Item 5: Avoid creating unnecessary objects
It is often appropriate to reuse a single object instead of creating a new functionally equivalent object each time it is needed. Reuse can be both faster and more stylish. An object can always be reused if it is immutable (Item 15).
每次需要一个对象时,与创建一个新的功能相同的对象相比,复用一个对象经常是合适的。复用更快更流行。如果一个对象是不变的,那它总是可以复用。(Item 15)
As an extreme example of what not to do, consider this statement:
下面是一个不该做什么的极端例子:
String s = new String("stringette"); // DON'T DO THIS!
The statement creates a new String instance each time it is executed, and none of those object creations is necessary. The argument to the String constructor ("stringette") is itself a String instance, functionally identical to all of the objects created by the constructor. If this usage occurs in a loop or in a frequently invoked method, millions of String instances can be created needlessly.
这条语句每次执行时都会创建一个新的String
实例,这些对象的创建都是没必要的。String
构造函数的参数"stringette"
本身就是一个String
实例,在功能上与构造函数创建的所有对象都是等价的。如果这种用法出现在一个循环或一个频繁调用的方法中,会创建出成千上万的不必要的String
实例。
The improved version is simply the following:
改进版本如下:
String s = "stringette";
This version uses a single String instance, rather than creating a new one each time it is executed. Furthermore, it is guaranteed that the object will be reused by any other code running in the same virtual machine that happens to contain the same string literal [JLS, 3.10.5].
这个版本使用单个的String
实例,而不是每次执行时创建一个新实例。此外,它保证了运行在虚拟中包含同样字符串的任何其它代码都可以复用这个对象[JLS, 3.10.5]。
You can often avoid creating unnecessary objects by using static factory methods (Item 1) in preference to constructors on immutable classes that provide both. For example, the static factory method Boolean.valueOf(String) is almost always preferable to the constructor Boolean(String). The constructor creates a new object each time it’s called, while the static factory method is never required to do so and won’t in practice.
对于提供了构造函数和静态工厂方法的不变类,使用静态工厂方法(Item 1)优先于构造函数常常可以让你避免创建不必要的对象。例如,静态工厂方法Boolean.valueOf
(String)总是优先于构造函数Boolean
(String)。每次调用构造函数都会创建一个新的对象,而静态工厂方法从来不要求这样做,在实践中也不会这样做。
In addition to reusing immutable objects, you can also reuse mutable objects if you know they won’t be modified. Here is a slightly more subtle, and much more common, example of what not to do. It involves mutable Date objects that are never modified once their values have been computed. This class models a person and has an isBabyBoomer method that tells whether the person is a “baby boomer,” in other words, whether the person was born between 1946 and 1964:
除了复用不可变对象之外,如果你知道可变对象不会被修改,你也可以复用可变对象。下面是一个比较微妙,更为常见反面例子。它包含可变的Date
对象,这些Date
对象一旦计算出来就不再修改。这个类对人进行了建模,其中有一个isBabyBoomer
方法用来区分这个人是否是一个“baby boomer(生育高峰时的小孩)”,换句话说就是判断这个人是否出生在1946年到1964年之间:
public class Person {
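A minimal sketch of the inefficient version described above. The constructor and the exact date arithmetic are assumptions made so the class is complete enough to run.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

class Person {
    private final Date birthDate;

    // Constructor assumed for this sketch; copies the mutable Date
    Person(Date birthDate) {
        this.birthDate = new Date(birthDate.getTime());
    }

    // DON'T DO THIS!
    public boolean isBabyBoomer() {
        // Unnecessary allocation of expensive objects on every call
        Calendar gmtCal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
        gmtCal.set(1946, Calendar.JANUARY, 1, 0, 0, 0);
        Date boomStart = gmtCal.getTime();
        gmtCal.set(1965, Calendar.JANUARY, 1, 0, 0, 0);
        Date boomEnd = gmtCal.getTime();
        return birthDate.compareTo(boomStart) >= 0
                && birthDate.compareTo(boomEnd) < 0;
    }
}
```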
The isBabyBoomer method unnecessarily creates a new Calendar, TimeZone, and two Date instances each time it is invoked. The version that follows avoids this inefficiency with a static initializer:
每次调用时,isBabyBoomer
方法都会创建一个Calendar
实例,一个TimeZone
实例和两个Date
实例,这是不必要的。下面的版本用静态初始化避免了这种低效率的问题:
class Person {
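A minimal sketch of the version with a static initializer, using the BOOM_START and BOOM_END field names that this item refers to. The constructor is again an assumption.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

class Person {
    private final Date birthDate;

    // Constructor assumed for this sketch; copies the mutable Date
    Person(Date birthDate) {
        this.birthDate = new Date(birthDate.getTime());
    }

    // The starting and ending dates of the baby boom, computed once
    private static final Date BOOM_START;
    private static final Date BOOM_END;

    static {
        Calendar gmtCal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
        gmtCal.set(1946, Calendar.JANUARY, 1, 0, 0, 0);
        BOOM_START = gmtCal.getTime();
        gmtCal.set(1965, Calendar.JANUARY, 1, 0, 0, 0);
        BOOM_END = gmtCal.getTime();
    }

    // No allocation here: only two comparisons against the shared constants
    public boolean isBabyBoomer() {
        return birthDate.compareTo(BOOM_START) >= 0
                && birthDate.compareTo(BOOM_END) < 0;
    }
}
```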
The improved version of the Person class creates Calendar, TimeZone, and Date instances only once, when it is initialized, instead of creating them every time isBabyBoomer is invoked. This results in significant performance gains if the method is invoked frequently. On my machine, the original version takes 32,000 ms for 10 million invocations, while the improved version takes 130 ms, which is about 250 times faster. Not only is performance improved, but so is clarity. Changing boomStart and boomEnd from local variables to static final fields makes it clear that these dates are treated as constants, making the code more understandable. In the interest of full disclosure, the savings from this sort of optimization will not always be this dramatic, as Calendar instances are particularly expensive to create.
Person
类的改进版本只在初始化时创建Calendar
,TimeZone
和Date
实例一次,而不是每次调用isBabyBoomer
方法都创建它们。如果isBabyBoomer
方法被频繁调用的话,这样做在性能上会有很大提升。在我的机器上,最初的版本一千万次调用要花费32,000毫秒,而改进版本只花了130毫秒,比最初版本快了大约250倍。不仅性能改善了,代码也更清晰了。将boomStart
和boomEnd
从局部变量变为static final
字段,很明显是将它们看作常量,代码也更容易理解。从整体收益来看,这种优化的节约并不总是这么戏剧性的,因为Calendar
实例创建的代价是非常昂贵的。
If the improved version of the Person class is initialized but its isBabyBoomer method is never invoked, the BOOM_START and BOOM_END fields will be initialized unnecessarily. It would be possible to eliminate the unnecessary initializations by lazily initializing these fields (Item 71) the first time the isBabyBoomer method is invoked, but it is not recommended. As is often the case with lazy initialization, it would complicate the implementation and would be unlikely to result in a noticeable performance improvement beyond what we’ve already achieved (Item 55).
如果初始化Person
类的改进版本,但从不调用它的isBabyBoomer
方法,BOOM_START
和BOOM_END
字段的初始化就是不必要的。可以通过延迟初始化(当需要时再初始化)这些字段(Item 71)来消除这些不必要的初始化,当第一次调用isBabyBoomer
方法时再进行初始化,但不推荐这样做。延迟初始化是常有的事,它的实现是非常复杂的,除了我们已有的性能提升之外,延迟初始化不可能引起明显的性能提升(Item 55)。
In the previous examples in this item, it was obvious that the objects in question could be reused because they were not modified after initialization. There are other situations where it is less obvious. Consider the case of adapters [Gamma95, p. 139], also known as views. An adapter is an object that delegates to a backing object, providing an alternative interface to the backing object. Because an adapter has no state beyond that of its backing object, there’s no need to create more than one instance of a given adapter to a given object.
在本条目前面的例子中,很明显问题中的对象可以复用,因为它们在初始化之后没有被修改。但在其它的情况下它就不那么明显了。考虑一个适配器的情况[Gamma95, p. 139],也称之为视图。适配器是代理支持对象的对象,为支持对象提供了一个可替代的接口。由于适配器除了它的支持对象之外没有别的状态,因此没必要创建多个给定对象的适配器实例。
For example, the keySet method of the Map interface returns a Set view of the Map object, consisting of all the keys in the map. Naively, it would seem that every call to keySet would have to create a new Set instance, but every call to keySet on a given Map object may return the same Set instance. Although the returned Set instance is typically mutable, all of the returned objects are functionally identical: when one of the returned objects changes, so do all the others because they’re all backed by the same Map instance. While it is harmless to create multiple instances of the keySet view object, it is also unnecessary.
例如,Map
接口的keySet
方法返回一个Map
对象的Set
视图,包含了map中所有的keys。乍一看,好像每一次调用keySet
方法都会创建一个新的Set
实例,但在一个给定的Map
对象上每次调用keySet
方法可能返回的都是同一个Set
实例。虽然返回的Set
实例通常都是可变的,但所有的返回对象在功能上是等价的:当一个返回对象改变时,其它的都要改变,因为它们都由同一个Map
实例支持。虽然创建多个keySet
视图对象的实例是无害的,但它是没必要的。
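The view behavior described above can be checked with a short sketch. `KeySetViewDemo` is a hypothetical class. The sketch deliberately does not rely on both calls returning the same Set object, since the text only says they may.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class KeySetViewDemo {
    // Returns true if a change made through one key-set view is visible
    // in the backing map and in a second view obtained separately.
    static boolean viewsStayInSync() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Set<String> first = map.keySet();
        Set<String> second = map.keySet();

        // Removing a key through the view removes the mapping from the map
        first.remove("a");

        return !map.containsKey("a")
                && !second.contains("a")
                && second.size() == 1;
    }
}
```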
There’s a new way to create unnecessary objects in release 1.5. It is called autoboxing, and it allows the programmer to mix primitive and boxed primitive types, boxing and unboxing automatically as needed. Autoboxing blurs but does not erase the distinction between primitive and boxed primitive types. There are subtle semantic distinctions, and not-so-subtle performance differences (Item 49). Consider the following program, which calculates the sum of all the positive int values. To do this, the program has to use long arithmetic, because an int is not big enough to hold the sum of all the positive int values:
在JDK 1.5中有一种新的方式来创建不必要对象。它被称为自动装箱,它允许程序员混合使用基本类型和它们的包装类型,JDK会在需要时自动装箱和拆箱,自动装箱虽然模糊但不能去除基本类型和包装类之间的区别。它们在语义上有稍微的不同,但不是轻微的性能差异(Item 49)。看一下下面的程序,计算所有正数int
值的总和。为了计算这个,程序必须使用long
类型,因为int
不能容纳所有正int
值的和:
// Hideously slow program! Can you spot the object creation?
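A minimal sketch of such a program, next to a primitive-only counterpart. The `limit` parameter and the two method names are assumptions, added so the loop can be run on a small range; passing Integer.MAX_VALUE gives the full computation.

```java
class Sum {
    // Slow variant: the accumulator is declared as a boxed Long
    static long sumBoxed(int limit) {
        Long sum = 0L;
        for (long i = 0; i <= limit; i++) {
            sum += i;
        }
        return sum;
    }

    // Fast variant: identical except that the accumulator is a primitive long
    static long sumPrimitive(int limit) {
        long sum = 0L;
        for (long i = 0; i <= limit; i++) {
            sum += i;
        }
        return sum;
    }
}
```

Both methods return the same value; they differ only in the declared type of the accumulator.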
This program gets the right answer, but it is much slower than it should be, due to a one-character typographical error. The variable sum is declared as a Long instead of a long, which means that the program constructs about 2^31 unnecessary Long instances (roughly one for each time the long i is added to the Long sum). Changing the declaration of sum from Long to long reduces the runtime from 43 seconds to 6.8 seconds on my machine. The lesson is clear: prefer primitives to boxed primitives, and watch out for unintentional autoboxing.
这个程序算出了正确答案,但由于一个字符的错误,它运行的更慢一些。变量sum
声明为Long
而不是long
,这意味着程序构建了大约2^31不必要的Long
实例(基本上每次long i
加到Long sum
上都要创建一个)。将sum
从Long
声明为long
之后,在我机器上运行时间从43秒降到了6.8秒。结论很明显:使用基本类型优先于包装类,当心无意的自动装箱。
This item should not be misconstrued to imply that object creation is expensive and should be avoided. On the contrary, the creation and reclamation of small objects whose constructors do little explicit work is cheap, especially on modern JVM implementations. Creating additional objects to enhance the clarity, simplicity, or power of a program is generally a good thing.
不该将本条目误解成暗示创建对象是昂贵的,应该避免创建对象。恰恰相反,创建和回收构造函数做很少显式工作的小对象是非常廉价的,尤其是在现代的JVM实现上。创建额外的对象来增强程序的清晰性,简洁性,或能力通常是一件好事。
Conversely, avoiding object creation by maintaining your own object pool is a bad idea unless the objects in the pool are extremely heavyweight. The classic example of an object that does justify an object pool is a database connection. The cost of establishing the connection is sufficiently high that it makes sense to reuse these objects. Also, your database license may limit you to a fixed number of connections. Generally speaking, however, maintaining your own object pools clutters your code, increases memory footprint, and harms performance. Modern JVM implementations have highly optimized garbage collectors that easily outperform such object pools on lightweight objects.
相反的,通过维护你自己的对象池来避免创建对象是一个坏主意,除非对象池中的对象是极度重量级的。真正证明对象池的对象经典例子是数据库连接。建立连接的代价是非常大的,因此复用这些对象是很有意义的。数据库许可可能也限制你使用固定数目的连接。但是,通常来说维护你自己的对象池会使你的代码很乱,增加内存占用,而且损害性能。现代JVM实现有高度优化的垃圾回收机制,维护轻量级对象很容易比对象池做的更好。
The counterpoint to this item is Item 39 on defensive copying. Item 5 says, “Don’t create a new object when you should reuse an existing one,” while Item 39 says, “Don’t reuse an existing object when you should create a new one.” Note that the penalty for reusing an object when defensive copying is called for is far greater than the penalty for needlessly creating a duplicate object. Failing to make defensive copies where required can lead to insidious bugs and security holes; creating objects unnecessarily merely affects style and performance.
与本条目对应的是Item 39 保护性拷贝。Item 5 声称,『不要创建一个新的对象,当你应该复用一个现有的对象时』,而Item 39 声称,『不要重用一个现有的对象,当你应该创建一个新的对象时』。注意,当保护性拷贝时复用一个对象的代价要远大于创建一个不必要的重复对象的代价。当需要时没有创建一个保护性拷贝可能导致潜在的错误和安全漏洞;创建不必要的对象只会影响程序风格及性能。
Item 6: Eliminate obsolete object references
When you switch from a language with manual memory management, such as C or C++, to a garbage-collected language, your job as a programmer is made much easier by the fact that your objects are automatically reclaimed when you’re through with them. It seems almost like magic when you first experience it. It can easily lead to the impression that you don’t have to think about memory management, but this isn’t quite true.
当你从一个手动管理内存的语言(例如C或C++)转到一个具有垃圾回收机制的语言时,作为一个程序员你的工作会更容易,当你使用完对象时,它们会被自动回收。当你第一个经历它时,它简直不可思议。它很容易给你留下一个你不需要考虑内存管理的印象,但事实并非如此。
Consider the following simple stack implementation:
考虑下面一种简单的栈实现的情况:
// Can you spot the "memory leak"?
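A minimal sketch of an array-backed stack with this kind of problem. The initial capacity and the growth policy are assumptions.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Can you spot the "memory leak"?
class Stack {
    private Object[] elements;
    private int size = 0;
    private static final int DEFAULT_INITIAL_CAPACITY = 16;

    public Stack() {
        elements = new Object[DEFAULT_INITIAL_CAPACITY];
    }

    public void push(Object e) {
        ensureCapacity();
        elements[size++] = e;
    }

    public Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        // The array slot still refers to the popped object after this line
        return elements[--size];
    }

    // Ensure space for at least one more element, roughly doubling
    // the capacity each time the array needs to grow.
    private void ensureCapacity() {
        if (elements.length == size) {
            elements = Arrays.copyOf(elements, 2 * size + 1);
        }
    }
}
```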
There’s nothing obviously wrong with this program (but see Item 26 for a generic version). You could test it exhaustively, and it would pass every test with flying colors, but there’s a problem lurking. Loosely speaking, the program has a “memory leak,” which can silently manifest itself as reduced performance due to increased garbage collector activity or increased memory footprint. In extreme cases, such memory leaks can cause disk paging and even program failure with an OutOfMemoryError, but such failures are relatively rare.
这个程序没有明显的错误(但请看Item 26的泛型版本)。你可以对它进行全面测试,它能出色的通过每一次测试,但这儿有一个潜在的问题。不严格的说,这个程序有一个『内存泄露』问题,由于垃圾回收活动的增加或内存占用的增加,性能下降的情况会逐渐表现出来。在极端的情况下,这种内存泄露可能引起磁盘分页,甚至会引起程序失败(OutOfMemoryError
),但这种失败是相对稀少的。
So where is the memory leak? If a stack grows and then shrinks, the objects that were popped off the stack will not be garbage collected, even if the program using the stack has no more references to them. This is because the stack maintains obsolete references to these objects. An obsolete reference is simply a reference that will never be dereferenced again. In this case, any references outside of the “active portion” of the element array are obsolete. The active portion consists of the elements whose index is less than size.
内存泄露在哪呢?如果栈先增长后收缩,出栈的对象将不能作为垃圾被收回,即使使用栈的程序不再引用它们。这是因为栈维护着这些对象的废弃引用。废弃引用是永远不会再解引用的引用。在这种情况下,元素数组活跃部分之外的其它引用都将被废弃。活跃部分包含了那些索引小于size
的元素。
Memory leaks in garbage-collected languages (more properly known as unintentional object retentions) are insidious. If an object reference is unintentionally retained, not only is that object excluded from garbage collection, but so too are any objects referenced by that object, and so on. Even if only a few object references are unintentionally retained, many, many objects may be prevented from being garbage collected, with potentially large effects on performance.
内存泄露在垃圾回收语言是隐蔽的(更合适的称呼是无意识对象保持)。如果一个对象引用被无意保留,不仅这个对象不能被垃圾回收处理,而且这个对象引用的其它对象也不能被垃圾回收处理,以此类推。即使只无意保留了几个对象的引用,但可能阻止了垃圾回收机制回收许多其它的对象,在性能上会有很大的潜在影响。
The fix for this sort of problem is simple: null out references once they become obsolete. In the case of our Stack class, the reference to an item becomes obsolete as soon as it’s popped off the stack. The corrected version of the pop method looks like this:
这类问题的修正很简单:一旦对象引用过期,就清空这些引用。在我们的Stack
类例子中,只要某一项从栈中取出,它的引用就废弃了。pop
方法的修正版本如下:
public Object pop() {
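A minimal sketch of the corrected pop method, shown inside a complete class so that it can be run. Everything other than pop is an assumption carried over from a typical array-backed stack.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

class Stack {
    private Object[] elements = new Object[16];
    private int size = 0;

    public void push(Object e) {
        if (elements.length == size) {
            elements = Arrays.copyOf(elements, 2 * size + 1);
        }
        elements[size++] = e;
    }

    public Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        elements[size] = null; // Eliminate obsolete reference
        return result;
    }
}
```

After pop returns, the array no longer holds the popped object, so it can be collected as soon as the caller drops its own reference.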
An added benefit of nulling out obsolete references is that, if they are subsequently dereferenced by mistake, the program will immediately fail with a NullPointerException, rather than quietly doing the wrong thing. It is always beneficial to detect programming errors as quickly as possible.
清空废弃引用的一个额外收益是,如果它们接下来被误解引用,程序会立刻抛出NullPointerException
,而不是静静地做错误的事情。对于尽可能快的检测程序错误,它总是有益的。
When programmers are first stung by this problem, they may overcompensate by nulling out every object reference as soon as the program is finished using it. This is neither necessary nor desirable, as it clutters up the program unnecessarily. Nulling out object references should be the exception rather than the norm. The best way to eliminate an obsolete reference is to let the variable that contained the reference fall out of scope. This occurs naturally if you define each variable in the narrowest possible scope (Item 45).
当程序员第一次被这个问题困扰时,他们可能是过分小心了,程序一旦完成了对象的使用,就清空每一个对象的引用。这既没必要也不可取,因此它会将程序不必要的弄乱。清空对象引用应该是例外情况而不是正常的行为。消除废弃引用的最好方式是让包含引用的变量结束其作用域。如果你在最紧凑的作用域范围内定义每个变量,这会很自然的发生。
So when should you null out a reference? What aspect of the Stack class makes it susceptible to memory leaks? Simply put, it manages its own memory. The storage pool consists of the elements of the elements array (the object reference cells, not the objects themselves). The elements in the active portion of the array (as defined earlier) are allocated, and those in the remainder of the array are free. The garbage collector has no way of knowing this; to the garbage collector, all of the object references in the elements array are equally valid. Only the programmer knows that the inactive portion of the array is unimportant. The programmer effectively communicates this fact to the garbage collector by manually nulling out array elements as soon as they become part of the inactive portion.
你应该什么时候清空一个引用?Stack
类的哪一个方面让它容易受到内存泄露影响?简单的说,它自己管理自己的内存。存储池包含了元素数组中的元素(对象引用单元,不是对象本身)。数组活跃部分的元素(前面定义的)被分配,数组中其余的元素是自由的。垃圾回收器不知道这种情况;对于垃圾回收器而言,元素数组中的所有对象引用都同等有效。只有程序员知道数组中非活跃部分是不重要的。程序员通过手动清空数组元素中不活跃的部分,可以有效的告诉垃圾回收器这个事实。
Generally speaking, whenever a class manages its own memory, the programmer should be alert for memory leaks. Whenever an element is freed, any object references contained in the element should be nulled out.
一般来说,只要一个类自己管理自己的内存,程序员就应该警惕内存泄露。无论什么时候释放一个元素,这个元素包含的对象引用都应该被清空。
Another common source of memory leaks is caches. Once you put an object reference into a cache, it’s easy to forget that it’s there and leave it in the cache long after it becomes irrelevant. There are several solutions to this problem. If you’re lucky enough to implement a cache for which an entry is relevant exactly so long as there are references to its key outside of the cache, represent the cache as a WeakHashMap; entries will be removed automatically after they become obsolete. Remember that WeakHashMap is useful only if the desired lifetime of cache entries is determined by external references to the key, not the value.
另一个常见的内存泄露来源是缓存。一旦你把一个对象引用放入缓存,很容易忘了它在缓存中,在用完之后很长一段时间仍把它放在缓存中。这个问题有几种解决方案。如果你很幸运的要实现一个对于输入项的缓存,只要缓存外部有输入项的键的引用,它就是相对确定的,可以用一个WeakHashMap
来表示缓存;在输入项废弃之后,它们会被自动移除。记住,只有缓存输入项的生命周期由输入项键的外部引用决定,不是由输入项值的外部引用决定时,WeakHashMap
才有用的。
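A minimal sketch of a cache backed by a WeakHashMap. The `MetadataCache` class and its String values are hypothetical. When an entry actually disappears depends on the garbage collector, so the sketch only demonstrates the part that is deterministic.

```java
import java.util.Map;
import java.util.WeakHashMap;

class MetadataCache {
    // Keys are held weakly: once no code outside the cache refers to a key,
    // its entry becomes eligible for removal. Values must not refer back to
    // their keys, or the keys would stay strongly reachable.
    private final Map<Object, String> cache = new WeakHashMap<>();

    void put(Object key, String metadata) {
        cache.put(key, metadata);
    }

    // Returns the cached metadata, or null if there is no entry for the key
    String get(Object key) {
        return cache.get(key);
    }
}
```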
More commonly, the useful lifetime of a cache entry is less well defined, with entries becoming less valuable over time. Under these circumstances, the cache should occasionally be cleansed of entries that have fallen into disuse. This can be done by a background thread (perhaps a Timer or ScheduledThreadPoolExecutor) or as a side effect of adding new entries to the cache. The LinkedHashMap class facilitates the latter approach with its removeEldestEntry method. For more sophisticated caches, you may need to use java.lang.ref directly.
更常见的是,缓存输入项的有效生命周期是不太好定义的,随时间推移缓存输入项变的更没价值。在这些情况下,缓存应该时不时的清除停止使用的缓存输入项。这项工作可以通过一个后台线程去做(可能是一个Timer
或ScheduledThreadPoolExecutor
)或在新的输入项添加到缓存中时顺便去做。LinkedHashMap
类利用它的removeEldestEntry
方法可以很容易实现后面的方法。对于更复杂的缓存,你可能需要使用java.lang.ref directly
。
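Eviction as a side effect of insertion can be sketched with LinkedHashMap. The `BoundedCache` class and its size limit are assumptions; access ordering is one possible policy, chosen here so that the least recently used entry is evicted first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // The third argument selects access order instead of insertion order
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put after each insertion; returning true removes the
    // eldest entry, which in access order is the least recently used.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of two, inserting a third entry evicts whichever of the first two was used least recently.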
A third common source of memory leaks is listeners and other callbacks. If you implement an API where clients register callbacks but don’t deregister them explicitly, they will accumulate unless you take some action. The best way to ensure that callbacks are garbage collected promptly is to store only weak references to them, for instance, by storing them only as keys in a WeakHashMap.
第三个常见的内存泄露来源是监听器和其它的回调函数。如果你实现一个API,它的客户端注册了回调函数但没有显式的注销它们,除非你采取一些动作,否则它们将累积。确保回调函数可以迅速被垃圾回收的最好方式是为存储它们的弱引用,例如,只将它们保存为WeakHashMap
的键。
Because memory leaks typically do not manifest themselves as obvious failures, they may remain present in a system for years. They are typically discovered only as a result of careful code inspection or with the aid of a debugging tool known as a heap profiler. Therefore, it is very desirable to learn to anticipate problems like this before they occur and prevent them from happening.
因为内存泄露通常不会表现为明显的故障,它们可能在系统中存在许多年。通常只有通过仔细的代码检查或借助被称为堆分析器的调试工具才能发现它们。因此,学会在这类问题发生之前预见它们并阻止它们发生是非常有必要的。
Item 7: Avoid finalizers
Finalizers are unpredictable, often dangerous, and generally unnecessary. Their use can cause erratic behavior, poor performance, and portability problems. Finalizers have a few valid uses, which we’ll cover later in this item, but as a rule of thumb, you should avoid finalizers.
终结方法通常是不可预测的,经常是危险的,一般来说是没必要的。使用它们会引起不稳定的行为,性能变低,可移植性问题等。终结方法有一些有效的使用,这个在本条目的后面会讲到,但根据经验,你应该避免使用终结方法。
C++ programmers are cautioned not to think of finalizers as Java’s analog of C++ destructors. In C++, destructors are the normal way to reclaim the resources associated with an object, a necessary counterpart to constructors. In Java, the garbage collector reclaims the storage associated with an object when it becomes unreachable, requiring no special effort on the part of the programmer. C++ destructors are also used to reclaim other nonmemory resources. In Java, the try-finally
block is generally used for this purpose.
提醒C++程序员,不要把终结方法看作Java中与C++析构函数对应的机制。在C++中,析构函数是回收对象相关资源的常规方式,是构造函数的必要对应。在Java中,当对象变得不可访问时,垃圾回收器会回收与对象相关联的存储空间,不需要程序员进行专门的工作。C++析构函数也用来回收其它的非内存资源。在Java中,通常使用try-finally
块用来完成这样的功能。
One shortcoming of finalizers is that there is no guarantee they’ll be executed promptly [JLS, 12.6]. It can take arbitrarily long between the time that an object becomes unreachable and the time that its finalizer is executed. This means that you should never do anything time-critical in a finalizer. For example, it is a grave error to depend on a finalizer to close files, because open file descriptors are a limited resource. If many files are left open because the JVM is tardy in executing finalizers, a program may fail because it can no longer open files.
终结方法的一个缺点是不能保证它们及时的执行[JLS,12.6]。从对象变得不可访问开始到它的终结方法被执行结束,这中间的时间可以任意长。这意味着你不应该在终结方法中做任何时间为关键的事情。例如,依赖终结方法来关闭文件是一个严重的错误,因为开放的文件描述符是一种有限的资源。如果许多文件都是打开状态,由于JVM执行终结方法时是迟缓的,因此程序可能失败,因为它不能再打开文件。
The promptness with which finalizers are executed is primarily a function of the garbage collection algorithm, which varies widely from JVM implementation to JVM implementation. The behavior of a program that depends on the promptness of finalizer execution may likewise vary. It is entirely possible that such a program will run perfectly on the JVM on which you test it and then fail miserably on the JVM favored by your most important customer.
终结方法执行的及时性主要取决于垃圾回收算法,而垃圾回收算法在不同的JVM实现中差别很大。依赖终结方法执行及时性的程序,其行为同样可能各不相同。一个程序在你测试它的JVM上运行得非常完美,但在你最重要的客户所青睐的JVM上却彻底失败,这是完全有可能的。
Tardy finalization is not just a theoretical problem. Providing a finalizer for a class can, under rare conditions, arbitrarily delay reclamation of its instances. A colleague debugged a long-running GUI application that was mysteriously dying with an OutOfMemoryError
. Analysis revealed that at the time of its death, the application had thousands of graphics objects on its finalizer queue just waiting to be finalized and reclaimed. Unfortunately, the finalizer thread was running at a lower priority than another application thread, so objects weren’t getting finalized at the rate they became eligible for finalization. The language specification makes no guarantees as to which thread will execute finalizers, so there is no portable way to prevent this sort of problem other than to refrain from using finalizers.
迟缓终结不仅仅是一个理论问题。在很少的情况下,为一个类提供终结方法可能会随意地延迟它实例的回收。有个同事调试一个长期运行的GUI应用,程序莫名其妙的死掉了,抛出了OutOfMemoryError
错误。分析表明在程序死亡时,应用的终结方法队列中有成千上万的图形对象在等待被终结并回收。遗憾的是,终结方法线程的运行优先级低于另一个应用线程,因此对象被终结的速度赶不上它们变得可以被终结的速度。语言规范不保证由哪一个线程来执行终结方法,因此除了避免使用终结方法之外,没有可移植的方式来阻止这种问题的发生。
Not only does the language specification provide no guarantee that finalizers will get executed promptly; it provides no guarantee that they’ll get executed at all. It is entirely possible, even likely, that a program terminates without executing finalizers on some objects that are no longer reachable. As a consequence, you should never depend on a finalizer to update critical persistent state. For example, depending on a finalizer to release a persistent lock on a shared resource such as a database is a good way to bring your entire distributed system to a grinding halt.
语言规范不仅不保证终结方法会被及时执行,而且根本不保证它们会被执行。一个程序终止时,某些已经无法访问的对象的终结方法还没有被执行,这是完全有可能的,甚至是很可能的。因此,你永远不应该依赖终结方法来更新重要的持久状态。例如,依赖终结方法来释放共享资源(例如数据库)上的持久锁,很容易让整个分布式系统陷入停顿。
Don’t be seduced by the methods System.gc
and System.runFinalization
. They may increase the odds of finalizers getting executed, but they don’t guarantee it. The only methods that claim to guarantee finalization are System.runFinalizersOnExit
and its evil twin, Runtime.runFinalizersOnExit
. These methods are fatally flawed and have been deprecated [ThreadStop].
不要被System.gc
和System.runFinalization
方法诱惑。它们可能会增加终结方法得到执行的几率,但它们不能保证它。能保证终结方法执行的唯一方法是System.runFinalizersOnExit
以及它臭名昭著的孪生兄弟Runtime.runFinalizersOnExit
。这些方法都有致命的缺陷并且已经被废弃了[ThreadStop]。
In case you are not yet convinced that finalizers should be avoided, here’s another tidbit worth considering: if an uncaught exception is thrown during finalization, the exception is ignored, and finalization of that object terminates [JLS, 12.6]. Uncaught exceptions can leave objects in a corrupt state. If another thread attempts to use such a corrupted object, arbitrary nondeterministic behavior may result. Normally, an uncaught exception will terminate the thread and print a stack trace, but not if it occurs in a finalizer—it won’t even print a warning.
以防你还不相信终结方法应该被避免,这儿有另一个情况值得思考:如果在终结方法执行期间抛出了一个未被捕获的异常,这个异常会被忽略,并且该对象的终结过程会终止[JLS,12.6]。未被捕获的异常可能会使对象处于损坏状态。如果另一个线程试图使用这样一个损坏的对象,任何不确定的行为都有可能发生。通常,一个未被捕获的异常会终止线程并打印栈轨迹,但如果它发生在终结方法中则不会,甚至连警告都不会打印。
Oh, and one more thing: there is a severe performance penalty for using finalizers. On my machine, the time to create and destroy a simple object is about 5.6 ns. Adding a finalizer increases the time to 2,400 ns. In other words, it is about 430 times slower to create and destroy objects with finalizers.
哦,还有一件事:使用终结方法会有严重的性能问题。在我的机器上,创建并销毁一个简单对象大约是5.6纳秒。添加一个终结方法会将这个时间增加到2400纳秒。换句话说,创建一个对象并用终结方法销毁对象比正常情况下大约慢了430倍。
So what should you do instead of writing a finalizer for a class whose objects encapsulate resources that require termination, such as files or threads? Just provide an explicit termination method, and require clients of the class to invoke this method on each instance when it is no longer needed. One detail worth mentioning is that the instance must keep track of whether it has been terminated: the explicit termination method must record in a private field that the object is no longer valid, and other methods must check this field and throw an IllegalStateException
if they are called after the object has been terminated.
那么,如果一个类的对象封装了需要终止的资源(例如文件或线程),你应该用什么来代替为这个类编写终结方法呢?只需提供一个显式的结束方法,并要求类的客户端在每个实例不再需要时调用这个方法。一个值得提及的细节是,实例必须跟踪它是否已经被终止:显式的结束方法必须在一个私有字段中记录对象已不再有效,其它方法必须检查这个字段,如果它们在对象被终止后被调用,就抛出IllegalStateException
。
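The bookkeeping described above, a private field that records termination and a check in every other method, can be sketched like this. ManagedResource, terminate and read are hypothetical names chosen for the example, and the sketch is not thread-safe.

```java
// Hypothetical resource wrapper illustrating an explicit termination method
class ManagedResource {
    private boolean terminated = false; // records that the object is no longer valid

    // Explicit termination method; calling it more than once is harmless here
    public void terminate() {
        terminated = true;
        // a real class would release the underlying resource at this point
    }

    public String read() {
        checkNotTerminated();
        return "data"; // placeholder for real work on the resource
    }

    private void checkNotTerminated() {
        if (terminated) {
            throw new IllegalStateException("resource already terminated");
        }
    }
}
```

Calling read() after terminate() fails immediately with an IllegalStateException instead of operating on a released resource.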
Typical examples of explicit termination methods are the close methods on InputStream
, OutputStream
, and java.sql.Connection
. Another example is the cancel
method on java.util.Timer
, which performs the necessary state change to cause the thread associated with a Timer
instance to terminate itself gently. Examples from java.awt
include Graphics.dispose
and Window.dispose
. These methods are often overlooked, with predictably dire performance consequences. A related method is Image.flush
, which deallocates all the resources associated with an Image
instance but leaves it in a state where it can still be used, reallocating the resources if necessary.
显式结束方法的典型例子是InputStream
,OutputStream
和java.sql.Connection
的关闭方法。另一个例子是java.util.Timer
的cancel
方法,它会执行必要的状态改变,使与Timer
实例相关联的线程平稳地结束它自己。java.awt
的例子包括Graphics.dispose
和Window.dispose
。这些方法经常被忽视,可以预料会引起可怕的性能后果。一个相关的方法是Image.flush
,它会释放所有Image
实例相关的资源,但会将实例保持在一个可用的状态,如果必要的时候重新分配资源。
Explicit termination methods are typically used in combination with the try-finally
construct to ensure termination. Invoking the explicit termination method inside the finally
clause ensures that it will get executed even if an exception is thrown while the object is being used:
显式结束方法通过与try-finally
结构结合来确保终结。在finally
语句块的内部调用显式的结束方法来确保它得到执行,即使对象使用时抛出了一个异常:
// try-finally block guarantees execution of termination method
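Only the first comment of this listing survives here. A minimal sketch of the pattern it describes follows; the Resource class and its use and terminate methods are hypothetical stand-ins, and the events list exists only to make the order of calls visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that records what happens to it
class Resource {
    final List<String> events = new ArrayList<String>();

    void use(boolean fail) {
        events.add("used");
        if (fail) {
            throw new IllegalStateException("failure while using the resource");
        }
    }

    // Explicit termination method
    void terminate() {
        events.add("terminated");
    }
}

class TerminationDemo {
    static List<String> run(boolean fail) {
        Resource resource = new Resource();
        try {
            try {
                resource.use(fail); // do what must be done with the resource
            } finally {
                resource.terminate(); // executes whether or not use() threw
            }
        } catch (IllegalStateException e) {
            resource.events.add("caught"); // the exception still propagates past finally
        }
        return resource.events;
    }
}
```

In both the normal and the failing run, "terminated" is recorded before control leaves the inner try statement.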
So what, if anything, are finalizers good for? There are perhaps two legitimate uses. One is to act as a “safety net” in case the owner of an object forgets to call its explicit termination method. While there’s no guarantee that the finalizer will be invoked promptly, it may be better to free the resource late than never, in those (hopefully rare) cases when the client fails to call the explicit termination method. But the finalizer should log a warning if it finds that the resource has not been terminated, as this indicates a bug in the client code, which should be fixed. If you are considering writing such a safety-net finalizer, think long and hard about whether the extra protection is worth the extra cost.
那终结方法有什么好处呢?有两种可能的合法应用。一个是作为『安全网』,以防对象拥有者忘记调用它的显式结束方法。但这不能保证终结方法得到及时的调用,当客户端调用显式结束方法失败时,在那种情况下(希望很少),后面释放资源总比不释放资源要好。但终结方法如果发现资源仍没有被释放,它应该输出一个警告,因为这意味着客户端代码存在一个BUG,它应该被修正。如果你正在考虑写这样一个安全网终结方法,要仔细思考这种额外的保护是否值得额外的代价。
The four classes cited as examples of the explicit termination method pattern (FileInputStream
, FileOutputStream
, Timer
, and Connection
) have finalizers that serve as safety nets in case their termination methods aren’t called. Unfortunately these finalizers do not log warnings. Such warnings generally can’t be added after an API is published, as it would appear to break existing clients.
作为显式结束方法模式引用的四个例子(FileInputStream
,FileOutputStream
,Timer
和Connection
)都有终结方法作为安全网以防它们的结束方法没有被调用。遗憾的是这些终结方法不输出警告。这种警告通常在API发布后不能进行添加,因为它会损坏现有的客户端。
A second legitimate use of finalizers concerns objects with native peers. A native peer is a native object to which a normal object delegates via native methods. Because a native peer is not a normal object, the garbage collector doesn’t know about it and can’t reclaim it when its Java peer is reclaimed. A finalizer is an appropriate vehicle for performing this task, assuming the native peer holds no critical resources. If the native peer holds resources that must be terminated promptly, the class should have an explicit termination method, as described above. The termination method should do whatever is required to free the critical resource. The termination method can be a native method, or it can invoke one.
终结方法的第二个合法使用是关于对象的本地对等体。本地对等体是一个本地对象,普通对象通过本地方法委托给本地对象。由于本地对等体不是一个正常的对象,当它的Java对等体回收时,垃圾回收器不知道并且不能回收它。假设本地对等体不拥有重要的资源,终结方法是执行这个任务的合适工具。如果本地对等体拥有必须及时终止的资源,这个类应该有一个显式的结束方法,如上所述。结束方法应该用来释放重要资源。结束方法可以是一个本地方法或它可以调用一个本地方法。
It is important to note that “finalizer chaining” is not performed automatically. If a class (other than Object
) has a finalizer and a subclass overrides it, the subclass finalizer must invoke the superclass finalizer manually. You should finalize the subclass in a try
block and invoke the superclass finalizer in the corresponding finally
block. This ensures that the superclass finalizer gets executed even if the subclass finalization throws an exception and vice versa. Here’s how it looks. Note that this example uses the Override
annotation (@Override
), which was added to the platform in release 1.5. You can ignore Override
annotations for now, or see Item 36 to find out what they mean:
很重要的一点就是要注意『终结方法链』是不能自动执行的。如果一个类(不是Object
)有一个终结方法,一个子类覆写了它,子类终结方法必须手动调用父类终结方法。你应该try
块内终止这个子类并在对应的finally
块调用父类终结方法。这保证了父类终结方法得到了执行,即使子类终结方法抛出异常,反之亦然。下面是它的一个例子。注意这个例子使用了Override
注解(@Override
),在release 1.5版本中添加。现在你可以忽略Override
注解,或看Item 36弄明白它是什么意思:
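The listing for manual finalizer chaining is missing here. As a hedged sketch of the shape the text describes: Base, Derived and ChainingDemo are hypothetical names, the log field exists only for illustration, and ChainingDemo calls finalize() directly purely to show the call order, which real code should never do. On recent JDKs, overriding finalize also produces deprecation warnings.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    final List<String> log = new ArrayList<String>(); // illustration only

    @Override
    protected void finalize() throws Throwable {
        log.add("Base finalized");
    }
}

class Derived extends Base {
    // Manual finalizer chaining
    @Override
    protected void finalize() throws Throwable {
        try {
            log.add("Derived finalized"); // finalize subclass state
        } finally {
            super.finalize(); // runs even if the subclass finalization throws
        }
    }
}

class ChainingDemo {
    // Invokes finalize() by hand only to make the ordering observable
    static List<String> run() {
        Derived d = new Derived();
        try {
            d.finalize();
        } catch (Throwable t) {
            d.log.add("threw " + t);
        }
        return new ArrayList<String>(d.log);
    }
}
```

The subclass cleanup is recorded first and the superclass cleanup second, because the finally block performs the chaining.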
If a subclass implementor overrides a superclass finalizer but forgets to invoke it, the superclass finalizer will never be invoked. It is possible to defend against such a careless or malicious subclass at the cost of creating an additional object for every object to be finalized. Instead of putting the finalizer on the class requiring finalization, put the finalizer on an anonymous class (Item 22) whose sole purpose is to finalize its enclosing instance. A single instance of the anonymous class, called a finalizer guardian, is created for each instance of the enclosing class. The enclosing instance stores the sole reference to its finalizer guardian in a private instance field so the finalizer guardian becomes eligible for finalization at the same time as the enclosing instance. When the guardian is finalized, it performs the finalization activity desired for the enclosing instance, just as if its finalizer were a method on the enclosing class:
如果一个子类实现者覆写了父类的终结方法但忘了调用它,父类终结方法将会从未调用。要注意防范这种粗心的或邪恶的子类是有可能的,代价就是为每个要被终结的对象创建一个额外的对象。代替将终结方法放在需要终结的类中,将终结方法放在一个匿名类中(Item 22),它的唯一目的就是终结它外围实例。匿名类的单个实例,被称为终结方法守护者,为外围类的每个实例创建一个匿名类实例。外围实例在一个私有字段存储了它的终结方法守护者的唯一引用,因此终结方法守护者与外围实例可以同时进行终结。当守护者被终结时,它会执行外围实例要求的终结活动,就像它的终结方法是外围实例的一个方法一样:
// Finalizer Guardian idiom
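Only the comment line of this listing survives. A minimal sketch of the idiom, assuming a hypothetical private cleanup method releaseResources, might read as follows. CarelessFoo is an added illustration of a subclass that forgets to call super.finalize.

```java
// Finalizer guardian sketch: Foo itself declares no finalizer
class Foo {
    // The sole purpose of this object is to finalize the enclosing Foo instance
    private final Object finalizerGuardian = new Object() {
        @Override
        protected void finalize() throws Throwable {
            releaseResources(); // finalize the enclosing Foo object
        }
    };

    private void releaseResources() {
        // hypothetical cleanup of the enclosing instance's resources
    }
}

// A careless subclass: its finalizer never calls super.finalize(),
// yet the guardian inherited from Foo can still be finalized independently
class CarelessFoo extends Foo {
    @Override
    protected void finalize() {
        // subclass cleanup only
    }
}
```

Because the guardian is reachable only through the enclosing instance, it becomes eligible for finalization at the same time as that instance, regardless of what the subclass finalizer does.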
Note that the public class, Foo
, has no finalizer (other than the trivial one it inherits from Object
), so it doesn’t matter whether a subclass finalizer calls super.finalize
or not. This technique should be considered for every nonfinal public class that has a finalizer.
注意公有类Foo
没有终结方法(除非它从Object
继承一个无关紧要的),因此子类的终结方法是否调用super.finalize
是不重要的。每一个含有终结方法的非终结公有类都应该考虑这个技术。
In summary, don’t use finalizers except as a safety net or to terminate noncritical native resources. In those rare instances where you do use a finalizer, remember to invoke super.finalize
. If you use a finalizer as a safety net, remember to log the invalid usage from the finalizer. Lastly, if you need to associate a finalizer with a public, nonfinal class, consider using a finalizer guardian, so finalization can take place even if a subclass finalizer fails to invoke super.finalize
.
总结:不要使用终结方法,除非是用作安全网或用来终止一个非重要的本地资源。在那些你使用终结方法的稀少实例中,记住调用super.finalize
。如果你使用终结方法作为安全网,记住在终结方法中输出非法用法。最后,如果你需要将终结方法关联到一个公有的,非终结类,考虑使用终结方法守护者,即使子类终结方法调用super.finalize
失败,也会进行终结。
Chapter 3 Methods Common to All Objects
ALTHOUGH Object
is a concrete class, it is designed primarily for extension. All of its nonfinal methods (equals
, hashCode
, toString
, clone
, and finalize
) have explicit general contracts because they are designed to be overridden. It is the responsibility of any class overriding these methods to obey their general contracts; failure to do so will prevent other classes that depend on the contracts (such as HashMap
and HashSet
) from functioning properly in conjunction with the class.
虽然Object
是一个具体的类,但设计它的主要目的是为了扩展。它的所有非final
方法(equals
,hashCode
,toString
,clone
和finalize
)都有明确的通用约定,因为设计它们的目的是为了重写。任何类都应该遵循通用约定重写这些方法;不这样做的话,依赖这些约定的其它类(例如HashMap
和HashSet
)将无法结合这个类正确运行。
This chapter tells you when and how to override the nonfinal Object
methods. The finalize
method is omitted from this chapter because it was discussed in Item 7. While not an Object
method, Comparable.compareTo
is discussed in this chapter because it has a similar character.
本章会告诉你什么时候以及怎样重写这些非final的Object
方法。本章会忽略finalize
方法,因为它在Item 7中已经讨论过了。虽然不是一个Object
方法,但是这章仍会讨论Comparable.compareTo
,因为它有一个类似的特性。
Item 8: Obey the general contract when overriding equals
Overriding the equals
method seems simple, but there are many ways to get it wrong, and consequences can be dire. The easiest way to avoid problems is not to override the equals
method, in which case each instance of the class is equal only to itself. This is the right thing to do if any of the following conditions apply:
重写equals
方法看似简单,但许多方式都会导致错误,结果是非常可怕的。避免这些问题的最简单方式是不要重写equals
方法,在这种情况下类的每个实例只等价于它本身。如果符合以下任何条件,这样做就是正确的:
Each instance of the class is inherently unique. This is true for classes such as Thread that represent active entities rather than values. The equals implementation provided by Object has exactly the right behavior for these classes.
类的每个实例本质上都是唯一的。对于表示活动实体而不是表示值的类确实如此,例如Thread。对于这些类,Object提供的equals实现具有完全正确的行为。
You don’t care whether the class provides a “logical equality” test. For example, java.util.Random could have overridden equals to check whether two Random instances would produce the same sequence of random numbers going forward, but the designers didn’t think that clients would need or want this functionality. Under these circumstances, the equals implementation inherited from Object is adequate.
不关心类是否提供“逻辑等价”的测试。例如,java.util.Random可以重写equals方法来检查两个Random实例是否会产生相同的随机数序列,但设计者认为客户不需要或者不想要这个功能。在这种情况下,从Object继承的equals实现就足够了。
A superclass has already overridden equals, and the superclass behavior is appropriate for this class. For example, most Set implementations inherit their equals implementation from AbstractSet, List implementations from AbstractList, and Map implementations from AbstractMap.
超类已经重写了equals,超类的行为对于子类是合适的。例如,大多数Set实现从AbstractSet继承了equals实现,List实现从AbstractList继承了equals实现,Map实现从AbstractMap继承了equals实现。
The class is private or package-private, and you are certain that its equals method will never be invoked. Arguably, the equals method should be overridden under these circumstances, in case it is accidentally invoked:
类是私有的或包私有的,可以确定它的equals方法从不会被调用。可以说,在这些情况下equals方法应该重写,以防它被偶然调用:
public boolean equals(Object o) {
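The listing is cut off after its first line. One way to guard against accidental invocation, sketched here with a hypothetical package-private class InternalToken, is to make equals fail loudly:

```java
// Hypothetical package-private class whose equals method is never meant to be called
class InternalToken {
    @Override
    public boolean equals(Object o) {
        // Fail fast if the method is invoked by accident
        throw new AssertionError("equals should never be called");
    }
}
```

Any accidental call, for example from a collection, then surfaces immediately as an AssertionError rather than silently falling back to identity comparison.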
So when is it appropriate to override Object.equals
? When a class has a notion of logical equality that differs from mere object identity, and a superclass has not already overridden equals
to implement the desired behavior. This is generally the case for value classes. A value class is simply a class that represents a value, such as Integer
or Date
. A programmer who compares references to value objects using the equals
method expects to find out whether they are logically equivalent, not whether they refer to the same object. Not only is overriding the equals
method necessary to satisfy programmer expectations; it enables instances to serve as map keys or set elements with predictable, desirable behavior.
什么时候重写Object.equals
方法是合适的?如果类具有逻辑等的概念,不同于对象同一性,并且超类没有重写equals
方法来实现要求的行为,这时候就需要重写equals
方法。这种情况通常是对值类而言的。值类仅仅是表示值的类,例如Integer
或Date
。程序员用equals
方法比较值对象的引用,期望找出它们是否是逻辑等价的,而不管它们是否是同一对象。重写equals
方法不仅满足了程序员的期望;它也能使实例作为映射表的主键或者集合的元素,使它们表现出可预期的行为。
One kind of value class that does not require the equals
method to be overridden is a class that uses instance control (Item 1) to ensure that at most one object exists with each value. Enum
types (Item 30) fall into this category. For these classes, logical equality is the same as object identity, so Object
’s equals
method functions as a logical equals method.
有一种不需要重写equals
方法的值类,它通过实例控制(Item 1)来确保每个值至多存在一个对象。枚举类型(Item 30)就是这种类。对于这种类而言,逻辑等价等同与对象同一性,Object
的equals
方法在功能上就如同逻辑等价方法。
When you override the equals
method, you must adhere to its general contract. Here is the contract, copied from the specification for Object
[JavaSE6]:
当你重写equals
方法时,你必须遵循通用约定。下面是约定内容,从Object
规范[JavaSE6]中拷贝的:
The equals method implements an equivalence relation. It is:
Reflexive: For any non-null reference value x, x.equals(x) must return true.
Symmetric: For any non-null reference values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
Transitive: For any non-null reference values x, y, z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
Consistent: For any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) must return false.
equals方法实现了一种等价关系。它是:
自反性:对于任何非空引用值x,x.equals(x)必须返回true。
对称性:对于任何非空引用值x和y,x.equals(y)必须返回true当且仅当y.equals(x)返回true。
传递性:对于任何非空引用值x,y,z,如果x.equals(y)返回true并且y.equals(z)返回true,则x.equals(z)必须返回true。
一致性:对于任何非空引用值x和y,x.equals(y)的多次调用一致返回true或一致返回false,假设对象进行equals比较时没有修改任何信息。
对于任何非空引用值x,x.equals(null)必须返回false。
Unless you are mathematically inclined, this might look a bit scary, but do not ignore it! If you violate it, you may well find that your program behaves erratically or crashes, and it can be very difficult to pin down the source of the failure. To paraphrase John Donne, no class is an island. Instances of one class are frequently passed to another. Many classes, including all collections classes, depend on the objects passed to them obeying the equals
contract.
除非你擅长数学,否则这可能看起来有点可怕,但不要忽视它!如果你违反了它,你可能会发现你的程序表现不正常或程序崩溃,并且很难确定失败的来源。用John Donne的话来说,没有类是孤立的。一个类的实例频繁传递给另一个类。许多类,包括所有的集合类,都依赖于传递给它们的对象遵循equals
约定。
Now that you are aware of the dangers of violating the equals
contract, let’s go over the contract in detail. The good news is that, appearances notwithstanding, the contract really isn’t very complicated. Once you understand it, it’s not hard to adhere to it. Let’s examine the five requirements in turn:
现在你已经意识到了违反了equals
约定的危险,让我们详细回顾一下这个约定。好消息是实际上这个约定并不复杂,尽管从表面上来看不是这样。一旦你理解了它,遵循它并不难。让我们依次检查这五个要求:
Reflexivity—The first requirement says merely that an object must be equal to itself. It is hard to imagine violating this requirement unintentionally. If you were to violate it and then add an instance of your class to a collection, the collection’s contains
method might well say that the collection didn’t contain the instance that you just added.
自反性——第一个要求仅仅是说一个对象必须等价于它本身。很难想象会无意的违反这个要求。如果你违反了它并将你的类实例添加到一个集合中,集合的contains
方法可能会说这个集合中不包含你刚刚添加的实例。
Symmetry—The second requirement says that any two objects must agree on whether they are equal. Unlike the first requirement, it’s not hard to imagine violating this one unintentionally. For example, consider the following class, which implements a case-insensitive string. The case of the string is preserved by toString
but ignored in comparisons:
对称性:第二个要求是说任何两个对象必须对它们是否相等达成一致。不像第一个要求,不难想象会无意的违反这个要求。例如,考虑下面的类,它实现了不区分大小写的字符串。toString
会保留字符串的大小写,但在比较时大小写会被忽略:
// Broken - violates symmetry!
The well-intentioned equals
method in this class naively attempts to interoperate with ordinary strings. Let’s suppose that we have one case-insensitive string and one ordinary one:
这个类中,equals
方法的意图很好,单纯的想要与普通的字符串进行互操作。假设我们有一个不区分大小写的字符串和一个普通的字符串:
CaseInsensitiveString cis = new CaseInsensitiveString("Polish");
As expected, cis.equals(s)
returns true
. The problem is that while the equals
method in CaseInsensitiveString
knows about ordinary strings, the equals
method in String
is oblivious to case-insensitive strings. Therefore s.equals(cis)
returns false
, a clear violation of symmetry. Suppose you put a case-insensitive string into a collection:
正如预料的那样,cis.equals(s)
返回true
。问题是虽然CaseInsensitiveString
中的equals
知道普通的字符串,但是String
中的equals
方法不注意不区分大小写的字符串。因此s.equals(cis)
返回false
,这明显违反了对称性。假设你将一个不区分大小写的字符串放到一个集合中:
List<CaseInsensitiveString> list = new ArrayList<CaseInsensitiveString>();
What does list.contains(s)
return at this point? Who knows? In Sun’s current implementation, it happens to return false
, but that’s just an implementation artifact. In another implementation, it could just as easily return true
or throw a runtime exception. Once you’ve violated the equals
contract, you simply don’t know how other objects will behave when confronted with your object.
这时list.contains(s)
会返回什么?谁知道呢?在Sun当前的实现中,它碰巧会返回false
,但那仅是一种实现方案。在另一种实现中,它也可能很容易的返回true
或抛出一个运行时异常。一旦你违反了equals
约定,当面对你的对象时,你根本不知道其它对象的行为会怎样。
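The listings for this example survive only as single lines. A sketch of the broken class as the text describes it follows; hashCode is omitted for brevity, and the ordinary string "polish" used to exercise it is an assumption that merely has to differ from "Polish" in case.

```java
// Broken: violates symmetry (sketch of the class described in the text)
final class CaseInsensitiveString {
    private final String s;

    CaseInsensitiveString(String s) {
        if (s == null)
            throw new NullPointerException();
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveString)
            return s.equalsIgnoreCase(((CaseInsensitiveString) o).s);
        if (o instanceof String) // one-way interoperability: the source of the bug
            return s.equalsIgnoreCase((String) o);
        return false;
    }

    @Override
    public String toString() {
        return s; // the original case is preserved here
    }
}
```

With cis wrapping "Polish" and an ordinary string s equal to "polish", cis.equals(s) is true while s.equals(cis) is false, because String.equals knows nothing about CaseInsensitiveString.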
To eliminate the problem, merely remove the ill-conceived attempt to interoperate with String
from the equals
method. Once you do this, you can refactor the method to give it a single return:
为了消除这个问题,只要从equals
方法中移除与String
进行交互的,考虑不周的尝试即可。一旦你这样做了,你可以重构这个方法给它一个返回即可:
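The refactored method is missing from this copy. A minimal sketch of an equals with a single return that no longer tries to interoperate with String could look like this (hashCode is again omitted for brevity):

```java
// Sketch of the repaired class: equals only recognizes other CaseInsensitiveString instances
final class CaseInsensitiveString {
    private final String s;

    CaseInsensitiveString(String s) {
        if (s == null)
            throw new NullPointerException();
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CaseInsensitiveString
                && ((CaseInsensitiveString) o).s.equalsIgnoreCase(s);
    }
}
```

Comparisons with an ordinary String now return false in both directions, so symmetry holds.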
Transitivity—The third requirement of the equals
contract says that if one object is equal to a second and the second object is equal to a third, then the first object must be equal to the third. Again, it’s not hard to imagine violating this requirement unintentionally. Consider the case of a subclass that adds a new value component to its superclass. In other words, the subclass adds a piece of information that affects equals
comparisons. Let’s start with a simple immutable two-dimensional integer point class:
传递性——equals
约定的第三个要求是说如果一个对象等价于第二个对象,而第二个对象等价于第三个对象,则第一个对象等价于第三个对象。同样的,不难想象会无意中违反这个要求。考虑这样一种情况,子类添加一个新的值组件到它的超类中。换句话说,子类添加的信息会影响equals
比较。以一个简单的不可变的二维整数点类作为开始:
public class Point {
Suppose you want to extend this class, adding the notion of color to a point:
假设你想扩展这个类,给点添加颜色的概念:
public class ColorPoint extends Point {
How should the equals
method look? If you leave it out entirely, the implementation is inherited from Point
and color information is ignored in equals
comparisons. While this does not violate the equals
contract, it is clearly unacceptable. Suppose you write an equals
method that returns true
only if its argument is another color point with the same position and color:
equals
方法应该看起来是怎样的?如果一点也不修改,直接从Point
继承equals
方法,在进行equals
比较时颜色信息会被忽略。虽然这没有违反equals
约定,但很明显这是不可接受的。假设你写了一个equals
方法,只有在它的参数是另一个有色点,且它们具有相同的位置和颜色时才返回true
:
// Broken - violates symmetry!
The problem with this method is that you might get different results when comparing a point to a color point and vice versa. The former comparison ignores color, while the latter comparison always returns false
because the type of the argument is incorrect. To make this concrete, let’s create one point and one color point:
这个方法的问题在于:当你比较一个普通点和一个有色点或相反的情况时,你可能会得到不同的结果。前者的比较忽略了颜色,而后者总是返回false
,因为参数类型不正确。为了使这个更具体一点,我们创建一个普通点和一个有色点:
Point p = new Point(1, 2);
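Since the Point and ColorPoint listings are truncated in this copy, here is a self-contained sketch of the two classes with the symmetry-violating equals described above. The Color enum is a stand-in for java.awt.Color, and the hashCode implementation is an assumption added so that Point is usable in hash-based collections.

```java
enum Color { RED, BLUE } // stand-in for java.awt.Color, keeps the sketch self-contained

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point))
            return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // assumed implementation, consistent with equals
    }
}

class ColorPoint extends Point {
    private final Color color;

    ColorPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }

    // Broken: violates symmetry
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint))
            return false; // a plain Point is never equal to a ColorPoint
        return super.equals(o) && ((ColorPoint) o).color == color;
    }
}
```

Comparing a Point with a ColorPoint at the same position gives a different answer depending on which object receives the call.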
Then p.equals(cp)
returns true
, while cp.equals(p)
returns false
. You might try to fix the problem by having ColorPoint.equals
ignore color when doing “mixed comparisons”:
p.equals(cp)
返回true
,而cp.equals(p)
返回false
。你可能想让ColorPoint.equals
进行比较混合比较时忽略颜色来修正这个问题:
// Broken - violates transitivity!
This approach does provide symmetry, but at the expense of transitivity:
这个方法提供了对称性,但违反了传递性:
ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
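A self-contained sketch of the "mixed comparison" variant follows. As before, the Color enum stands in for java.awt.Color and hashCode is an assumed implementation. The three points used to exercise it are a red color point, a plain point and a blue color point, all at (1, 2).

```java
enum Color { RED, BLUE } // stand-in for java.awt.Color

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point))
            return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // assumed implementation
    }
}

class ColorPoint extends Point {
    private final Color color;

    ColorPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }

    // Broken: violates transitivity
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point))
            return false;
        // If o is a normal Point, do a color-blind comparison
        if (!(o instanceof ColorPoint))
            return o.equals(this);
        // o is a ColorPoint; do a full comparison
        return super.equals(o) && ((ColorPoint) o).color == color;
    }
}
```

Each color point equals the plain point, yet the two color points are not equal to each other.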
Now p1.equals(p2)
and p2.equals(p3)
return true
, while p1.equals(p3)
returns false
, a clear violation of transitivity. The first two comparisons are “color-blind,” while the third takes color into account.
现在p1.equals(p2)
和p2.equals(p3)
返回true
,而p1.equals(p3)
返回false
,很明显这违反了传递性。前两个比较忽略了颜色,而第三个比较考虑了颜色。
So what’s the solution? It turns out that this is a fundamental problem of equivalence relations in object-oriented languages. There is no way to extend an instantiable class and add a value component while preserving the equals
contract, unless you are willing to forgo the benefits of object-oriented abstraction.
因此解决方案是什么?事实证明:在面向对象语言中,等价关系问题是一个基本的问题。当保留equals
约定时,你无法在扩展一个实例化的类的同时添加值组件,除非你愿意放弃面向对象抽象的优势。
You may hear it said that you can extend an instantiable class and add a value component while preserving the equals
contract by using a getClass
test in place of the instanceof
test in the equals
method:
你可能听说过你可以在equals
方法中通过使用getClass
测试代替instanceof
测试,从而在扩展一个可实例化的类并添加值组件的同时,保留equals
约定:
// Broken - violates Liskov substitution principle (page 40)
This has the effect of equating objects only if they have the same implementation class. While this may not seem so bad, the consequences are unacceptable.
这样做的效果是:只有当对象具有相同的实现类时,它们才会被认为相等。虽然这看起来可能不算太糟,但其后果是不可接受的。
Let’s suppose we want to write a method to tell whether an integer point is on the unit circle. Here is one way we could do it:
假设我们想写一个方法来判断一个整数点是否在单位圆上。下面是一种写法:
// Initialize UnitCircle to contain all Points on the unit circle
private static final Set<Point> unitCircle;
While this may not be the fastest way to implement the functionality, it works fine. But suppose you extend Point
in some trivial way that doesn’t add a value component, say, by having its constructor keep track of how many instances have been created:
虽然这可能不是实现这个功能的最快方式,但它确实有效。但假设你以某种不添加值组件的方式扩展了Point
,例如通过它的构造函数来追踪创建了多少实例:
public class CounterPoint extends Point {
The Liskov substitution principle says that any important property of a type should also hold for its subtypes, so that any method written for the type should work equally well on its subtypes [Liskov87]. But suppose we pass a CounterPoint
instance to the onUnitCircle
method. If the Point
class uses a getClass
based equals
method, the onUnitCircle
method will return false
regardless of the CounterPoint
instance’s x
and y
values. This is so because collections, such as the HashSet
, used by the onUnitCircle
method, use the equals
method to test for containment, and no CounterPoint
instance is equal to any Point
. If, however, you use a proper instanceof
-based equals
method on Point
, the same onUnitCircle
method will work fine when presented with a CounterPoint
.
里氏替换原则认为,一个类型的任何重要属性也适用于它的子类型,因此该类型编写的任何方法在它的子类型中也都应该工作良好[Liskov87]。但假设我们给onUnitCircle
传递了一个CounterPoint
实例。如果Point
类使用了基于getClass
的equals
方法,onUnitCircle
将会返回false
,无论CounterPoint
实例的x
值和y
值是多少。这是因为集合,例如onUnitCircle
方法中的HashSet
,使用equals
方法来测试是否包含元素,没有CounterPoint
实例等于Point
。然而,如果你在Point
上使用合适的基于instanceof
的equals
方法,当面对CounterPoint
时,同样的onUnitCircle
方法会工作的很好。
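The failure described above can be reproduced with the following sketch. The getClass-based equals, the four unit-circle points and the instance counter follow the text; the UnitCircle holder class, the numberCreated accessor and the hashCode implementation are assumptions added to make the example self-contained.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Broken: violates the Liskov substitution principle
    @Override
    public boolean equals(Object o) {
        if (o == null || o.getClass() != getClass())
            return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // assumed implementation
    }
}

// Trivial subclass that adds no value component, only an instance counter
class CounterPoint extends Point {
    private static final AtomicInteger counter = new AtomicInteger();

    CounterPoint(int x, int y) {
        super(x, y);
        counter.incrementAndGet();
    }

    static int numberCreated() {
        return counter.get();
    }
}

class UnitCircle {
    // All integer points on the unit circle
    private static final Set<Point> unitCircle = new HashSet<Point>();
    static {
        unitCircle.add(new Point(1, 0));
        unitCircle.add(new Point(0, 1));
        unitCircle.add(new Point(-1, 0));
        unitCircle.add(new Point(0, -1));
    }

    static boolean onUnitCircle(Point p) {
        return unitCircle.contains(p);
    }
}
```

A plain Point(1, 0) is reported as being on the unit circle, while a CounterPoint(1, 0) is not, even though both describe the same position.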
While there is no satisfactory way to extend an instantiable class and add a value component, there is a fine workaround. Follow the advice of Item 16, “Favor composition over inheritance.” Instead of having ColorPoint
extend Point
, give ColorPoint
a private Point
field and a public view method (Item 5) that returns the point at the same position as this color point:
尽管没有令人满意的方式来扩展一个可实例化的类并添加值组件,但有一个很好的解决方案。遵循Item 16 “Favor composition over inheritance”的建议,不再让ColorPoint
继承Point
,而是通过在ColorPoint
中添加一个私有的Point
字段和一个公有的视图方法(Item 5),此方法返回一个与有色点具有相同位置的普通点:
// Adds a value component without violating the equals contract
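Only the leading comment of this listing is left. A sketch of the composition-based ColorPoint described above follows; the view method name asPoint, the Color enum (a stand-in for java.awt.Color) and the hashCode implementations are assumptions made for the example.

```java
enum Color { RED, BLUE } // stand-in for java.awt.Color

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point))
            return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // assumed implementation
    }
}

// Adds a value component without violating the equals contract
class ColorPoint {
    private final Point point;
    private final Color color;

    ColorPoint(int x, int y, Color color) {
        if (color == null)
            throw new NullPointerException();
        this.point = new Point(x, y);
        this.color = color;
    }

    // Returns the point-view of this color point
    public Point asPoint() {
        return point;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint))
            return false;
        ColorPoint cp = (ColorPoint) o;
        return cp.point.equals(point) && cp.color.equals(color);
    }

    @Override
    public int hashCode() {
        return 31 * point.hashCode() + color.hashCode(); // assumed implementation
    }
}
```

A ColorPoint is no longer a Point, so the two types never compare equal in either direction; code that needs a position calls asPoint() and compares plain points.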
There are some classes in the Java platform libraries that do extend an instantiable class and add a value component. For example, java.sql.Timestamp
extends java.util.Date
and adds a nanoseconds
field. The equals
implementation for Timestamp
does violate symmetry and can cause erratic behavior if Timestamp
and Date
objects are used in the same collection or are otherwise intermixed. The Timestamp
class has a disclaimer cautioning programmers against mixing dates and timestamps. While you won’t get into trouble as long as you keep them separate, there’s nothing to prevent you from mixing them, and the resulting errors can be hard to debug. This behavior of the Timestamp
class was a mistake and should not be emulated.
在Java平台库中有一些类扩展了一个可实例化的类并添加了一个值组件。例如,java.sql.Timestamp
扩展了java.util.Date
并添加了一个nanoseconds
字段。Timestamp
的equals
实现确实违反了对称性,如果Timestamp
和Date
用在同一个集合中或混杂在一起,会引起不稳定的行为。Timestamp
类有一个免责声明,警告程序员不要混合日期和时间戳。虽然只要你将它们分开就不会有麻烦,但是没有任何东西阻止你混合它们,而且产生的错误很难调试。Timestamp
类的这个行为是一个错误,不应该进行模仿。
Note that you can add a value component to a subclass of an abstract class without violating the equals
contract. This is important for the sort of class hierarchies that you get by following the advice in Item 20, “Prefer class hierarchies to tagged classes.” For example, you could have an abstract class Shape
with no value components, a subclass Circle
that adds a radius
field, and a subclass Rectangle
that adds length
and width
fields. Problems of the sort shown above won’t occur so long as it is impossible to create a superclass instance directly.
注意,你可以添加值组件到抽象类的子类而且不会违反equals
约定。对于遵循Item 20 “Prefer class hierarchies to tagged classes”的建议而得到这种类层次来说,这是非常重要的。例如,你可以有一个没有值组件的抽象类Shape
,子类Circle
添加了radius
字段,子类Rectangle
添加了length
和width
字段。只要不能直接创建一个超类实例,上面的种种问题就不会发生。
Consistency—The fourth requirement of the equals
contract says that if two objects are equal, they must remain equal for all time unless one (or both) of them is modified. In other words, mutable objects can be equal to different objects at different times while immutable objects can’t. When you write a class, think hard about whether it should be immutable (Item 15). If you conclude that it should, make sure that your equals
method enforces the restriction that equal objects remain equal and unequal objects remain unequal for all time.
一致性——equals
约定的第四个要求是说如果两个对象相等,它们必须一致相等,除非其中一个(或二者)被修改了。换句话说,可变对象在不同的时间可以等于不同的对象而不可变对象不能。当你写了一个类,仔细想想它是否应该是不可变的(Item 15)。如果你推断它应该是不可变的,那么要确保你的equals
方法满足这样的约束条件:相等的对象永远相等,不等的对象永远不等。
Whether or not a class is immutable, do not write an equals
method that depends on unreliable resources. It’s extremely difficult to satisfy the consistency requirement if you violate this prohibition. For example, java.net.URL
’s equals
method relies on comparison of the IP addresses of the hosts associated with the URLs. Translating a host name to an IP
address can require network access, and it isn’t guaranteed to yield the same results over time. This can cause the URL equals
method to violate the equals
contract and has caused problems in practice. (Unfortunately, this behavior cannot be changed due to compatibility requirements.) With very few exceptions, equals
methods should perform deterministic computations on memory-resident objects.
无论一个类是否是不可变的,都不要写一个依赖于不可靠资源的equals
方法。如果你违反了这个禁令,要满足一致性要求是非常困难的。例如,java.net.URL
的equals
方法依赖于对关联URL主机的IP地址的比较。将主机名转换成IP地址可能需要访问网络,随时间推移它不能保证取得相同的结果。这可能会导致URL equals
方法违反equals
约定并在实践中产生问题。(很遗憾,由于兼容性问题,这一行为不能被修改。)除了极少数例外,equals
方法应该对常驻内存对象进行确定性计算。
“Non-nullity”—The final requirement, which in the absence of a name I have taken the liberty of calling “non-nullity,” says that all objects must be unequal to null
. While it is hard to imagine accidentally returning true
in response to the invocation o.equals(null)
, it isn’t hard to imagine accidentally throwing a NullPointerException
. The general contract does not allow this. Many classes
have equals
methods that guard against this with an explicit test for null
:
“非空性”——最后的要求由于没有名字我称之为“非空性”,这个要求是说所有的对象都不等于null
。虽然很难想象调用o.equals(null)
会偶然的返回true
,但不难想象会意外抛出NullPointerException
的情况。通用约定不允许出现这种情况。许多类的equals
方法为了防止出现这种情况都进行对null
的显式测试:
@Override public boolean equals(Object o) {
    if (o == null)
        return false;
    ...
}
This test is unnecessary. To test its argument for equality, the equals
method must first cast its argument to an appropriate type so its accessors may be invoked or its fields accessed. Before doing the cast, the method must use the instanceof
operator to check that its argument is of the correct type:
这个测试是没必要的。为了平等测试其参数,为了调用它的访问器或访问其字段,equals
方法首先必须将它的参数转换成合适的类型。在进行转换之前,equals
方法必须使用instanceof
操作符来检查它的参数是否是正确的类型:
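@Override public boolean equals(Object o) {
    if (!(o instanceof MyType))
        return false;
    MyType mt = (MyType) o;
    ...
}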
If this type check were missing and the equals
method were passed an argument of the wrong type, the equals
method would throw a ClassCastException
, which violates the equals
contract. But the instanceof
operator is specified to return false
if its first operand is null
, regardless of what type appears in the second operand [JLS, 15.20.2]. Therefore the type check will return false
if null
is passed in, so you don’t need a separate null
check.
如果缺少类型检查,equals
方法传入了一个错误类型的参数,equals
方法会抛出ClassCastException
,这违反了equals
约定。但当指定instanceof
时,如果它的第一个操作数为null
,无论它的第二个操作数是什么类型,它都会返回false
[JLS, 15.20.2]。所以如果传入null
类型检查将会返回false
,因此你不必进行单独的null
检查。
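To make the point concrete, here is a small sketch of a class whose `equals` relies only on the `instanceof` test. `MyType` and its single `value` field are placeholders assumed for illustration.

```java
// Hypothetical value class: equals has no separate null check.
final class MyType {
    private final int value;

    MyType(int value) { this.value = value; }

    @Override public boolean equals(Object o) {
        // "null instanceof MyType" is false, so null is rejected here too.
        if (!(o instanceof MyType))
            return false;
        MyType mt = (MyType) o; // safe: guarded by the instanceof test
        return value == mt.value;
    }

    @Override public int hashCode() { return value; }
}
```

Calling `new MyType(1).equals(null)` returns `false` without throwing, and an argument of an unrelated type is rejected by the same test.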
Putting it all together, here’s a recipe for a high-quality equals
method:
1. Use the == operator to check if the argument is a reference to this object. If so, return true. This is just a performance optimization, but one that is worth doing if the comparison is potentially expensive.
2. Use the instanceof operator to check if the argument has the correct type. If not, return false. Typically, the correct type is the class in which the method occurs. Occasionally, it is some interface implemented by this class. Use an interface if the class implements an interface that refines the equals contract to permit comparisons across classes that implement the interface. Collection interfaces such as Set, List, Map, and Map.Entry have this property.
3. Cast the argument to the correct type. Because this cast was preceded by an instanceof test, it is guaranteed to succeed.
4. For each “significant” field in the class, check if that field of the argument matches the corresponding field of this object. If all these tests succeed, return true; otherwise, return false. If the type in step 2 is an interface, you must access the argument’s fields via interface methods; if the type is a class, you may be able to access the fields directly, depending on their accessibility.
将上面所有的内容放在一起,下面是编写一个高质量equals
方法的流程:
1. 使用==操作符来检查参数是否是这个对象的一个引用。如果是,返回true。这只是一个性能优化,如果比较的代价有可能很昂贵,这样做是值得的。
2. 使用instanceof操作符来检查参数类型是否正确。如果不正确,返回false。通常,正确的类型是指equals方法所在的那个类。有时候,它是这个类实现的某个接口。如果一个类实现的接口改进了equals约定,允许在实现了该接口的类之间进行比较,那么就使用接口。集合接口例如Set,List,Map和Map.Entry都有这个特性。
3. 将参数转换成正确的类型。由于转换之前已经进行了instanceof测试,因此它保证能成功。
4. 对于类中的每一个“有意义”字段,检查参数的这个字段是否匹配这个对象的对应字段。如果所有的这些测试都成功了,返回true;否则返回false。如果第二步中的类型是一个接口,你必须通过接口方法访问参数的字段;如果类型是一个类,你也许能够直接访问字段,这取决于它们的可访问性。
For primitive fields whose type is not float
or double
, use the ==
operator for comparisons; for object reference fields, invoke the equals
method recursively; for float
fields, use the Float.compare
method; and for double
fields, use Double.compare
. The special treatment of float
and double
fields is made necessary by the existence of Float.NaN
, -0.0f
and the analogous double
constants; see the Float.equals
documentation for details. For array fields, apply these guidelines to each element. If every element in an array field is significant, you can use one of the Arrays.equals
methods added in release 1.5.
对于基本类型,如果不是float
或double
,使用==
操作符进行比较;对于对象引用字段,递归地调用equals
方法;对于float
字段,使用Float.compare
方法;对于double
字段,使用Double.compare
。float
和double
字段的特别对待是有必要的,因为存在Float.NaN
,-0.0f
和类似的double
常量;更多细节请看Float.equals
。对于数组字段,对每个元素应用这些指导。如果数组中的每个元素都是有意义的,你可以使用1.5版本中添加的Arrays.equals
方法。
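The per-type comparison rules above can be sketched in one `equals` method. The class `Measurement` and its fields are hypothetical; `Objects.hash` (available since Java 7, so later than the releases the text discusses) is used here only to keep `hashCode` short.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class with one field for each comparison rule.
final class Measurement {
    private final int id;         // primitive other than float/double: ==
    private final float ratio;    // float: Float.compare
    private final double value;   // double: Double.compare
    private final String unit;    // object reference, assumed non-null: equals
    private final long[] samples; // array, every element significant: Arrays.equals

    Measurement(int id, float ratio, double value, String unit, long[] samples) {
        this.id = id;
        this.ratio = ratio;
        this.value = value;
        this.unit = unit;
        this.samples = samples.clone();
    }

    @Override public boolean equals(Object o) {
        if (o == this)
            return true;
        if (!(o instanceof Measurement))
            return false;
        Measurement m = (Measurement) o;
        return id == m.id
            && Float.compare(ratio, m.ratio) == 0
            && Double.compare(value, m.value) == 0
            && unit.equals(m.unit)
            && Arrays.equals(samples, m.samples);
    }

    @Override public int hashCode() {
        return Objects.hash(id, ratio, value, unit, Arrays.hashCode(samples));
    }
}
```

With `Float.compare` and `Double.compare`, two instances holding `NaN` compare equal, while `0.0` and `-0.0` compare unequal; a plain `==` would give the opposite answer in both cases.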
Some object reference fields may legitimately contain null
. To avoid the possibility of a NullPointerException
, use this idiom to compare such fields:
某些对象引用字段可能合理的包含null
。为了避免产生NullPointerException
的可能性,使用下面的习惯用法来比较这些字段:
(field == null ? o.field == null : field.equals(o.field))
This alternative may be faster if field
and o.field
are often identical:
如果field
和o.field
经常是等价的,使用下面的可替代方式可能会更快:
(field == o.field || (field != null && field.equals(o.field)))
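As an illustration of the first idiom inside a full class, consider the sketch below. `Contact`, `name`, and `nickname` are hypothetical names; only `nickname` is allowed to be null.

```java
// Hypothetical class with one field that may legitimately be null.
final class Contact {
    private final String name;     // assumed never null
    private final String nickname; // may be null

    Contact(String name, String nickname) {
        this.name = name;
        this.nickname = nickname;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Contact))
            return false;
        Contact c = (Contact) o;
        return name.equals(c.name)
            // Null-safe comparison: never dereferences a null nickname.
            && (nickname == null ? c.nickname == null : nickname.equals(c.nickname));
    }

    @Override public int hashCode() {
        return 31 * name.hashCode() + (nickname == null ? 0 : nickname.hashCode());
    }
}
```

On Java 7 and later, `java.util.Objects.equals(a, b)` performs the same null-safe comparison and may be preferred for readability.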
For some classes, such as CaseInsensitiveString
above, field comparisons are more complex than simple equality tests. If this is the case, you may want to store a canonical form of the field, so the equals
method can do cheap exact comparisons on these canonical forms rather than more costly inexact comparisons. This technique is most appropriate for immutable classes (Item 15); if the object can change, you must keep the canonical form up to date.
对于某些类而言,例如上面的CaseInsensitiveString
,字段比较比简单的相等性检测更复杂。如果是这种情况,你可能想存储这个字段的标准形式,因此equals
方法可以在这些标准形式上进行低开销的精确比较,而不是更高代码的非精确比较。这种技术最适合不可变类(Item 15);如果对象可以改变,你必须保持最新的标准形式。
The performance of the equals
method may be affected by the order in which fields are compared. For best performance, you should first compare fields that are more likely to differ, less expensive to compare, or, ideally, both. You must not compare fields that are not part of an object’s logical state, such as Lock
fields used to synchronize operations. You need not compare redundant fields, which can be calculated from “significant fields,” but doing so may improve the performance of the equals
method. If a redundant field amounts to a summary description of the entire object, comparing this field will save you the expense of comparing the actual data if the comparison fails. For example, suppose you have a Polygon
class, and you cache the area. If two polygons have unequal areas, you needn’t bother comparing their edges and vertices.
equals
方法的性能可能会受到字段比较顺序的影响。为了最佳性能,你首先应该比较那些更可能不同,比较代价更小的字段,或者理想情况下二者兼具的字段。你不能比较那些不属于对象逻辑状态一部分的字段,例如同步操作中的Lock
字段。你也不需要比较冗余的字段,它们能从“有意义字段”中计算出来,但这样做可能会改善equals
方法的性能。如果冗余字段相当于整个对象的概要描述,比较这个字段,如果失败的话会节省你比较真正数据的开销。例如,假设你有一个Polygon
类,并且你缓存这个区域。如果两个多边形有不同的面积,你就不需要比较它们的边和顶点。
When you are finished writing your equals method, ask yourself three questions: Is it symmetric? Is it transitive? Is it consistent? And don’t just ask yourself; write unit tests to check that these properties hold! If they don’t, figure out why not, and modify the equals method accordingly. Of course your equals method also has to satisfy the other two properties (reflexivity and “non-nullity”), but these two usually take care of themselves.
当你完成了equals方法的编写时,问你自己三个问题:它是否是对称的?是否是可传递的?是否是一致的?并且不要只问你自己;编写单元测试来检查是否拥有这些属性!如果没有这些属性,弄清楚为什么没有,并相应地修改equals方法。当然你的equals方法也必须满足其它两个属性(自反性和“非空性”),但这两个属性通常会自动满足。
For a concrete example of an equals
method constructed according to the above recipe, see PhoneNumber.equals
in Item 9. Here are a few final caveats:
根据上述规则构建的equals
方法具体例子请看Item 9的PhoneNumber.equals。下面是一些最后的警告:
Always override hashCode when you override equals (Item 9).
当你重写equals时,总是重写hashCode方法(Item 9)。
Don’t try to be too clever. If you simply test fields for equality, it’s not hard to adhere to the equals contract. If you are overly aggressive in searching for equivalence, it’s easy to get into trouble. It is generally a bad idea to take any form of aliasing into account. For example, the File class shouldn’t attempt to equate symbolic links referring to the same file. Thankfully, it doesn’t.
不要试图自作聪明。如果你简单的测试字段的相等性,不难遵循equals约定。如果过度的追求等价关系,很容易陷入到麻烦中。考虑任何形式的别名通常不是一个好想法。例如,File类不应该试图把指向同一文件的符号链接看作相等。所幸它没有这样做。
Don’t substitute another type for Object in the equals declaration. It is not uncommon for a programmer to write an equals method that looks like this, and then spend hours puzzling over why it doesn’t work properly:
不要将equals声明中的Object替换为其它类型。程序员写出下面这样的equals方法并不少见,然后花费好几个小时也弄不明白它为什么不能正确工作:
public boolean equals(MyClass o) { ... }
The problem is that this method does not override Object.equals
, whose argument is of type Object
, but overloads it instead (Item 41). It is acceptable to provide such a “strongly typed” equals
method in addition to the normal one as long as the two methods return the same result, but there is no compelling reason to do so. It may provide minor performance gains under certain circumstances, but it isn’t worth the added complexity (Item 55).
这个问题在于这个方法没有重写Object.equals
方法,Object.equals
方法的参数类型是Object
,但相反,它重载了equals
方法(Item 41)。除了正常的equals
方法之外,提供这样一个“强类型”equals
方法是可接受的,只要这两个方法返回同样的结果,但没有令人信服的理由去这样做。在某些特定环境下它可能会提供很小的收益,但相对于增加的复杂性来讲是不值得的(Item 55)。
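The difference between overloading and overriding shows up as soon as the object is handled through an `Object` reference or a collection. The sketch below uses a hypothetical class `OverloadedEquals`; the helper `foundInList` exists only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the mistake: equals is overloaded, not overridden.
final class OverloadedEquals {
    private final int id;

    OverloadedEquals(int id) { this.id = id; }

    // Takes OverloadedEquals instead of Object, so Object.equals(Object)
    // is still inherited unchanged alongside this method.
    public boolean equals(OverloadedEquals o) {
        return o != null && id == o.id;
    }

    // Reports whether a list containing an equal-looking element finds it.
    static boolean foundInList() {
        List<OverloadedEquals> list = new ArrayList<OverloadedEquals>();
        list.add(new OverloadedEquals(1));
        // List.contains calls equals(Object), which compares identities here.
        return list.contains(new OverloadedEquals(1));
    }
}
```

A direct call with a statically typed argument picks the overload and looks correct, which is why the bug can go unnoticed until the class is used with collections.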
Consistent use of the @Override
annotation, as illustrated throughout this item, will prevent you from making this mistake (Item 36). This equals
method won’t compile and the error message will tell you exactly what is wrong:
正如本条目阐述的那样,@Override
注解的一致使用会阻止你犯这个错误(Item 36)。这个equals
方法不能编译并且错误信息会确切告诉你错误是什么。
@Override public boolean equals(MyClass o) { ... }
Item 9: Always override hashCode when you override equals
A common source of bugs is the failure to override the hashCode
method. You must override hashCode
in every class that overrides equals
. Failure to do so will result in a violation of the general contract for Object.hashCode
, which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap
, HashSet
, and Hashtable
.
一个常见的错误来源是没有重写hashCode
方法。在每个重写equals
方法的类中,你必须重写hashCode
方法。不这样做会违反Object.hashCode
的通用约定,这会使你的类不能在功能上与所有基于哈希的集合进行恰当的结合,包括HashMap
,HashSet
和Hashtable
。
Here is the contract, copied from the Object
specification [JavaSE6]:
下面是这些约定,从Object
规范中拷贝的[JavaSE6]:
Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
在一个应用执行期间,只要对象的equals比较所用到的信息没有被修改,那么无论对同一个对象调用多少次hashCode方法,它都必须一致地返回同一个整数。这个整数在同一个应用的多次执行之间不必保持一致。
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
如果两个对象根据equals(Object)方法是相等的,那么调用每一个对象的hashCode方法必须产生同样的整数结果。
It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
如果两个对象根据equals(Object)方法不相等,不要求调用每一个对象的hashCode方法必须产生不同的整数结果。然而,程序员应该意识到对于不等的对象产生不同的整数结果可能改善哈希表的性能。
The key provision that is violated when you fail to override hashCode
is the second one: equal objects must have equal hash codes. Two distinct instances may be logically equal according to a class’s equals
method, but to Object
’s hashCode
method, they’re just two objects with nothing much in common. Therefore Object
’s hashCode
method returns two seemingly random numbers instead of two equal numbers as required by the contract.
当不重写hashCode
时,违反的第二条是关键约定:相等对象必须具有相等的哈希值。两个不同的对象根据类的equals
方法可能在逻辑上是相等的,但对于Object
的hashCode
方法,它们是两个对象,没有共同的东西,因此Object
的hashCode
方法返回两个看似随机的数字来代替约定要求的相等数字。
For example, consider the following simplistic PhoneNumber
class, whose equals
method is constructed according to the recipe in Item 8:
例如,考虑下面简化的PhoneNumber
类,它的equals
方法是根据Item 8的流程构建的:
public final class PhoneNumber { ... }
Suppose you attempt to use this class with a HashMap
:
假设你试图在HashMap
中使用这个类:
Map<PhoneNumber, String> m = new HashMap<PhoneNumber, String>();
m.put(new PhoneNumber(707, 867, 5309), "Jenny");
At this point, you might expect m.get(new PhoneNumber(707, 867, 5309))
to return “Jenny”, but it returns null. Notice that two PhoneNumber
instances are involved: one is used for insertion into the HashMap
, and a second, equal, instance is used for (attempted) retrieval. The PhoneNumber
class’s failure to override hashCode
causes the two equal instances to have unequal hash codes, in violation of the hashCode
contract. Therefore the get
method is likely to look for the phone number in a different hash bucket from the one in which it was stored by the put
method. Even if the two instances happen to hash to the same bucket, the get
method will almost certainly return null, as HashMap
has an optimization that caches the hash code associated with each entry and doesn’t bother checking for object equality if the hash codes don’t match.
这时候,你可能期待m.get(new PhoneNumber(707, 867, 5309))
返回Jenny
,但它返回空。注意涉及到两个PhoneNumber
实例:一个用来插入到HashMap
,第二个相等的实例用来(试图)检索。PhoneNumber
类没有重写hashCode
方法引起两个相等的实例有不等的哈希值,违反了hashCode
约定。因此get
方法可能在一个与put
方法储存的哈希桶不同的哈希桶中查找电话号码。即使两个实例碰到哈希到同一个桶中,get
几乎必定返回空,因为HashMap
缓存了每个条目相关的哈希码,如果哈希码不匹配,不会检查对象的相等性。
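To make the failure concrete, here is a sketch of a `PhoneNumber` class along the lines the text describes: three `short` fields and an `equals` method built from the Item 8 recipe, but no `hashCode`. The constructor's range checks and the `BrokenLookupDemo` class are assumptions added for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class PhoneNumber {
    private final short areaCode;
    private final short prefix;
    private final short lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        rangeCheck(areaCode, 999, "area code");
        rangeCheck(prefix, 999, "prefix");
        rangeCheck(lineNumber, 9999, "line number");
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNumber = (short) lineNumber;
    }

    private static void rangeCheck(int arg, int max, String name) {
        if (arg < 0 || arg > max)
            throw new IllegalArgumentException(name + ": " + arg);
    }

    @Override public boolean equals(Object o) {
        if (o == this)
            return true;
        if (!(o instanceof PhoneNumber))
            return false;
        PhoneNumber pn = (PhoneNumber) o;
        return pn.lineNumber == lineNumber
            && pn.prefix == prefix
            && pn.areaCode == areaCode;
    }

    // Broken: hashCode is deliberately not overridden.
}

final class BrokenLookupDemo {
    public static void main(String[] args) {
        Map<PhoneNumber, String> m = new HashMap<PhoneNumber, String>();
        PhoneNumber key = new PhoneNumber(707, 867, 5309);
        m.put(key, "Jenny");
        // Same instance: same identity hash code, so the entry is found.
        System.out.println(m.get(key));
        // Equal but distinct instance: almost certainly not found.
        System.out.println(m.get(new PhoneNumber(707, 867, 5309)));
    }
}
```

The second lookup is "almost certainly" rather than "certainly" a miss because nothing prevents two distinct objects from receiving the same identity hash code by chance.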
Fixing this problem is as simple as providing a proper hashCode
method for the PhoneNumber
class. So what should a hashCode
method look like? It’s trivial to write one that is legal but not good. This one, for example, is always legal but should never be used:
修正这个问题很简单,为PhoneNumber
类提供一个合适的hashCode
方法。因此hashCode
方法应该看起来是什么样的?编写一个合法但不好的方法很容易。例如,下面的方法总是合法的,但绝不应该使用:
// The worst possible legal hash function - never use!
@Override public int hashCode() { return 42; }
It’s legal because it ensures that equal objects have the same hash code. It’s atrocious because it ensures that every object has the same hash code. Therefore, every object hashes to the same bucket, and hash tables degenerate to linked lists. Programs that should run in linear time instead run in quadratic time. For large hash tables, this is the difference between working and not working.
它是合法的因为它保证了相等的对象有同样的哈希值。它是极差的因为它保证了每个对象都有同样的哈希值。因此,每个对象哈希到相同的桶中,哈希表退化成链表。本应在线性时间内运行的程序变成在平方时间内运行。对于大的哈希表,这是能工作和不能工作的区别。
A good hash function tends to produce unequal hash codes for unequal objects. This is exactly what is meant by the third provision of the hashCode
contract. Ideally, a hash function should distribute any reasonable collection of unequal instances uniformly across all possible hash values. Achieving this ideal can be difficult. Luckily it’s not too difficult to achieve a fair approximation. Here is a simple recipe:
一个好的哈希函数对于不等的对象趋向于产生不等的哈希值。这与hashCode
约定中的第三条是一个意思。理想情况下,一个哈希函数应该将任何合理的不等的实例集合,统一散列在所有可能的哈希值上。要取得这样的目标是非常困难的。幸运的是不难取得一个公平的近似。下面是简单的流程:
1. Store some constant nonzero value, say, 17, in an int variable called result.
2. For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
   a. Compute an int hash code c for the field:
      i. If the field is a boolean, compute (f ? 1 : 0).
      ii. If the field is a byte, char, short, or int, compute (int) f.
      iii. If the field is a long, compute (int)(f ^ (f >>> 32)).
      iv. If the field is a float, compute Float.floatToIntBits(f).
      v. If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.
      vi. If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional).
      vii. If the field is an array, treat it as if each element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine these values per step 2.b. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.
   b. Combine the hash code c computed in step 2.a into result as follows: result = 31 * result + c;
3. Return result.
4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition! If equal instances have unequal hash codes, figure out why and fix the problem.

1. 在一个名为result的int变量中存储某个非零常量值,例如17。
2. 对于对象中每一个有意义的字段f(即equals方法考虑的每一个字段),按以下做法去做:
   a. 为这个字段计算一个int型的哈希码c:
      i. 如果这个字段是一个boolean,计算(f ? 1 : 0)。
      ii. 如果这个字段是一个byte,char,short或int,计算(int) f。
      iii. 如果这个字段是一个long,计算(int)(f ^ (f >>> 32))。
      iv. 如果这个字段是一个float,计算Float.floatToIntBits(f)。
      v. 如果这个字段是一个double,计算Double.doubleToLongBits(f),然后按步骤2.a.iii对得到的long值进行哈希。
      vi. 如果这个字段是一个对象引用并且这个类的equals方法通过递归调用equals方法来比较这个字段,那么对这个字段递归地调用hashCode方法。如果需要更复杂的比较,为这个字段计算一个“标准表示”然后在标准表示上调用hashCode方法。如果字段值为null,返回0(或其它某个常量,但传统上使用0)。
      vii. 如果字段是一个数组,将它每一个元素看做是一个单独的字段。也就是说,通过递归地应用这些规则为每一个有意义的元素计算一个哈希码,并按步骤2.b将这些值组合起来。如果数组的每个元素都是有意义的,你可以用1.5版本中添加的Arrays.hashCode方法。
   b. 按如下方式将步骤2.a计算的哈希码c合并到result中:result = 31 * result + c;
3. 返回result。
4. 当你完成了hashCode方法的编写后,问一下自己相等的实例是否有相等的哈希码。写单元测试来验证你的直觉!如果相等的实例有不等的哈希码,弄明白为什么并修正这个问题。
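The recipe can be applied mechanically. The sketch below uses a hypothetical class `Reading` with one field of each kind the recipe covers; the comments map each line to its step. It is an illustration under those assumptions, not a listing from the original text.

```java
import java.util.Arrays;

// Hypothetical value class with one field for each case in step 2.a.
final class Reading {
    private final boolean valid;
    private final int sensorId;
    private final long timestamp;
    private final float gain;
    private final double value;
    private final String label; // may be null
    private final int[] raw;    // every element is significant

    Reading(boolean valid, int sensorId, long timestamp, float gain,
            double value, String label, int[] raw) {
        this.valid = valid;
        this.sensorId = sensorId;
        this.timestamp = timestamp;
        this.gain = gain;
        this.value = value;
        this.label = label;
        this.raw = raw.clone();
    }

    @Override public boolean equals(Object o) {
        if (o == this)
            return true;
        if (!(o instanceof Reading))
            return false;
        Reading r = (Reading) o;
        return valid == r.valid
            && sensorId == r.sensorId
            && timestamp == r.timestamp
            && Float.compare(gain, r.gain) == 0
            && Double.compare(value, r.value) == 0
            && (label == null ? r.label == null : label.equals(r.label))
            && Arrays.equals(raw, r.raw);
    }

    @Override public int hashCode() {
        int result = 17;                                                // step 1
        result = 31 * result + (valid ? 1 : 0);                         // 2.a.i
        result = 31 * result + sensorId;                                // 2.a.ii
        result = 31 * result + (int) (timestamp ^ (timestamp >>> 32));  // 2.a.iii
        result = 31 * result + Float.floatToIntBits(gain);              // 2.a.iv
        long bits = Double.doubleToLongBits(value);                     // 2.a.v
        result = 31 * result + (int) (bits ^ (bits >>> 32));
        result = 31 * result + (label == null ? 0 : label.hashCode());  // 2.a.vi
        result = 31 * result + Arrays.hashCode(raw);                    // 2.a.vii
        return result;                                                  // step 3
    }
}
```

Every field read by `equals` is also read by `hashCode`, and nothing else is, which is what keeps equal instances on equal hash codes.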
You may exclude redundant fields from the hash code computation. In other words, you may ignore any field whose value can be computed from fields included in the computation. You must exclude any fields that are not used in equals
comparisons, or you risk violating the second provision of the hashCode
contract.
你可以从哈希码计算中排除冗余字段。换句话说,你可以忽略那些可以从根据计算中的字段计算出值的字段。你必须排除那些equals
比较没有使用的字段,否则就有可能违反hashCode
约定中的第二条。
A nonzero initial value is used in step 1 so the hash value will be affected by initial fields whose hash value, as computed in step 2.a, is zero. If zero were used as the initial value in step 1, the overall hash value would be unaffected by any such initial fields, which could increase collisions. The value 17 is arbitrary.
步骤1中使用了一个非零初始值,因此哈希值会受到哈希值为0的最初字段的影响,最初字段的哈希值是在步骤2.a中计算的。如果0作为初始值在步骤1中使用,全部的哈希值将不受任何这样的最初字段的影响,这将会增加哈希碰撞。数值17是任意选取的。
The multiplication in step 2.b makes the result depend on the order of the fields, yielding a much better hash function if the class has multiple similar fields. For example, if the multiplication were omitted from a String
hash function, all anagrams would have identical hash codes. The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i
. Modern VMs do this sort of optimization automatically.
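Both claims in this paragraph are easy to check with a few lines of code. The sketch below is illustrative only; `HashMixingDemo` and its three methods are hypothetical names, and neither hash is the real `String.hashCode`.

```java
// Sketch: why step 2.b multiplies, and why 31 is cheap to multiply by.
final class HashMixingDemo {
    // Order-insensitive: just sums the characters, with no multiplication.
    static int additiveHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++)
            h += s.charAt(i);
        return h;
    }

    // Order-sensitive: result = 31 * result + c, as in step 2.b.
    static int polynomialHash(String s) {
        int h = 17;
        for (int i = 0; i < s.length(); i++)
            h = 31 * h + s.charAt(i);
        return h;
    }

    // 31 * i rewritten as a shift and a subtraction.
    static int times31(int i) {
        return (i << 5) - i;
    }
}
```

Anagrams such as "ab" and "ba" collide under `additiveHash` but not under `polynomialHash`, and `times31(i)` agrees with `31 * i` for every `int`, including values that overflow, because both are computed modulo 2^32.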
步骤2.b中的乘法使结果依赖于字段的顺序,如果这个类有多个相似的字段,会得到一个好得多的哈希函数。例如,如果String哈希函数省略了乘法,所有的变位词(字母相同但顺序不同的单词)将有相同的哈希码。选择值31是因为它是一个奇素数。如果它是偶数并且乘法溢出,会损失信息,因为与2相乘等价于移位运算。使用素数的优势不是那么明显,但习惯上都使用素数。31的一个很好的特性是乘法可以用移位和减法运算替换从而取得更好的性能:31 * i == (i << 5) - i。现代的虚拟机能自动进行这类优化。
Let’s apply the above recipe to the PhoneNumber class. There are three significant fields, all of type short:
让我们对PhoneNumber类应用上面的步骤。这儿有三个有意义的字段,类型都是short:
public int hashCode() { ... }
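Since the listing above is truncated in this copy, here is a sketch of what applying the recipe to three `short` fields might look like. The constructor is simplified (no range checks) and is an assumption; the `hashCode` body follows steps 1 through 3 directly.

```java
// Sketch: PhoneNumber with a hashCode built from the recipe.
final class PhoneNumber {
    private final short areaCode;
    private final short prefix;
    private final short lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNumber = (short) lineNumber;
    }

    @Override public boolean equals(Object o) {
        if (o == this)
            return true;
        if (!(o instanceof PhoneNumber))
            return false;
        PhoneNumber pn = (PhoneNumber) o;
        return pn.lineNumber == lineNumber
            && pn.prefix == prefix
            && pn.areaCode == areaCode;
    }

    @Override public int hashCode() {
        int result = 17;                      // step 1
        result = 31 * result + areaCode;      // step 2 for each field
        result = 31 * result + prefix;
        result = 31 * result + lineNumber;
        return result;                        // step 3
    }
}
```

With this in place, looking up an equal but distinct `PhoneNumber` in a `HashMap` finds the entry stored under the original key.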
Because this method returns the result of a simple deterministic computation whose only inputs are the three significant fields in a PhoneNumber
instance, it is clear that equal PhoneNumber
instances have equal hash codes. This method is, in fact, a perfectly good hashCode
implementation for PhoneNumber
, on a par with those in the Java platform libraries. It is simple, reasonably fast, and does a reasonable job of dispersing unequal phone numbers into different hash buckets.
因为这个方法返回一个简单的确定性运算的结果,唯一的输入是PhoneNumber
实例中的三个有效字段,很明显相等的PhoneNumber
有相等的哈希值。事实上,这个方法对于PhoneNumber
来说是一个完美的很好的hashCode
实现,与Java平台库的实现不相上下。它很简单,相当快,并且合理地将不等的电话号码分散到不同的哈希桶里。
If a class is immutable and the cost of computing the hash code is significant, you might consider caching the hash code in the object rather than recalculating it each time it is requested. If you believe that most objects of this type will be used as hash keys, then you should calculate the hash code when the instance is created. Otherwise, you might choose to lazily initialize it the first time hashCode
is invoked (Item 71). It is not clear that our PhoneNumber
class merits this treatment, but just to show you how it’s done:
如果一个类是不可变的,计算哈希码的代价是很明显的,你可能想缓存对象中的哈希码而不是每次请求时重新计算它。如果你认为这种类型的大多数对象将作为哈希键使用,那当实例创建时你应该计算哈希码。此外,当第一次调用hashCode
时(Item 71),你可以选择延迟初始化。我们的PhoneNumber
类进行这样处理的优点不是很明显,但可以显示一下它是怎么做的:
// Lazily initialized, cached hashCode
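The listing is truncated in this copy, so the following is a sketch of one common way to cache a hash code lazily in an immutable class. The class name `CachedPhoneNumber` is hypothetical, and the use of 0 as the "not yet computed" marker is an assumption of this sketch.

```java
// Sketch: immutable class that computes its hash code at most once per thread race.
final class CachedPhoneNumber {
    private final short areaCode;
    private final short prefix;
    private final short lineNumber;

    // Lazily initialized, cached hashCode; 0 means "not computed yet".
    private volatile int hashCode;

    CachedPhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNumber = (short) lineNumber;
    }

    @Override public boolean equals(Object o) {
        if (o == this)
            return true;
        if (!(o instanceof CachedPhoneNumber))
            return false;
        CachedPhoneNumber pn = (CachedPhoneNumber) o;
        return pn.lineNumber == lineNumber
            && pn.prefix == prefix
            && pn.areaCode == areaCode;
    }

    @Override public int hashCode() {
        int result = hashCode;
        if (result == 0) {
            result = 17;
            result = 31 * result + areaCode;
            result = 31 * result + prefix;
            result = 31 * result + lineNumber;
            hashCode = result; // racing threads compute the same value
        }
        return result;
    }
}
```

If the computed value ever happened to be 0 it would simply be recomputed on each call, which is harmless but forgoes the caching.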
While the recipe in this item yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do the Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and theoretical computer scientists. Perhaps a later release of the platform will provide state-of-the-art hash functions for its classes and utility methods to allow average programmers to construct such hash functions. In the meantime, the techniques described in this item should be adequate for most applications.
虽然本条目中的这些步骤能得到相当好的哈希函数,但它得不到最先进的哈希函数,截至1.6版本Java平台库也没有提供这样的哈希函数。写这样的哈希函数是一个研究课题,最好留给数学家和理论计算机科学家。也许Java平台后面的版本会为它的类提供最先进的哈希函数,并提供工具方法让普通的程序员也能构建这样的哈希函数。同时,本条目描述的技术应该足够满足大部分应用了。
Do not be tempted to exclude significant parts of an object from the hash code computation to improve performance. While the resulting hash function may run faster, its poor quality may degrade hash tables’ performance to the point where they become unusably slow. In particular, the hash function may, in practice, be confronted with a large collection of instances that differ largely in the regions that you’ve chosen to ignore. If this happens, the hash function will map all the instances to a very few hash codes, and hash-based collections will display quadratic performance. This is not just a theoretical problem. The String
hash function implemented in all releases prior to 1.2 examined at most sixteen characters, evenly spaced throughout the string, starting with the first character. For large collections of hierarchical names, such as URLs, this hash function displayed exactly the pathological behavior noted here.
不要试图将对象的有效部分排除在哈希码计算之外来提高性能。虽然最终结果的哈希函数可能运行更快,但它的质量很差可能会降低哈希表的性能,使哈希表变成慢的不可用的状态。尤其是在实践中,哈希函数可能面临在你选择忽略的区域中存在很大不同的实例集合。如果这种情况发生了,哈希函数会映射所有的实例到一个非常小的哈希码上,基于哈希的集合的性能将会变成平方级的。这不仅仅是一个理论问题。String
哈希函数在1.2之前的实现中,最多检查16个字符,整个字符串等间距,从第一个字符开始。对于名字分层的大集合,例如URLs,哈希函数正好展现了这里提到的病态行为。
Many classes in the Java platform libraries, such as String
, Integer
, and Date
, include in their specifications the exact value returned by their hashCode
method as a function of the instance value. This is generally not a good idea, as it severely limits your ability to improve the hash function in future releases. If you leave the details of a hash function unspecified and a flaw is found or a better hash function discovered, you can change the hash function in a subsequent release, confident that no clients depend on the exact values returned by the hash function.
Java平台库中的许多类,例如String
,Integer
和Date
,包含了类规范中它们的hashCode
方法返回的确定值。这通常不是一个好主意,因为它严重限制了你在将来版本中改进哈希函数的能力。如果没有指定哈希函数的细节,当发现有缺陷或一个更好的哈希函数时,你可以在接下来的版本中改变哈希函数,确信没有用户依赖哈希函数返回的确定值。
Item 10: Always override toString
While java.lang.Object
provides an implementation of the toString
method, the string that it returns is generally not what the user of your class wants to see. It consists of the class name followed by an “at” sign (@) and the unsigned hexadecimal representation of the hash code, for example, “PhoneNumber@163b91.” The general contract for toString
says that the returned string should be “a concise but informative representation that is easy for a person to read” [JavaSE6]. While it could be argued that “PhoneNumber@163b91” is concise and easy to read, it isn’t very informative when compared to “(707) 867-5309.” The toString
contract goes on to say, “It is recommended that all subclasses override this method.” Good advice, indeed!
尽管java.lang.Object
提供了toString
方法的实现,但是通常情况下它返回的字符串不是使用类的用户想要的。返回的字符串包含类名,后面是一个@
符号加上哈希码的十六进制表示,例如PhoneNumber@163b91
。toString
的通用约定指出,返回值应该是“简洁但易读的信息表示”[JavaSE6]。虽然可以认为PhoneNumber@163b91
简洁易读,但它与(707) 867-5309
相比,它的信息不够丰富。toString
约定进一步指出,“建议所有的子类重写这个方法”。确实是个好建议。
While it isn’t as important as obeying the equals
and hashCode
contracts (Item 8, Item 9), providing a good toString
implementation makes your class much more pleasant to use. The toString
method is automatically invoked when an object is passed to println
, printf
, the string concatenation operator, or assert
, or printed by a debugger. (The printf
method was added to the platform in release 1.5, as were related methods including String.format
, which is roughly equivalent to C’s sprintf
.)
虽然它不像遵守equals
和hashCode
约定(Item 8, Item 9)那样重要,但是提供一个好的toString
实现可以使你的类用起来更舒适。当对象传到println
,printf
,字符串连接操作符,或assert
中,或通过调试器打印时,会自动调用toString
方法。(Java 1.5版本中平台加入了printf
方法,相关的方法包括String.format
,类似于C语言中的sprintf
方法)。
If you’ve provided a good toString
method for PhoneNumber
, generating a useful diagnostic message is as easy as this:
如果你已经为PhoneNumber
提供了一个好的toString
方法,生成有用的诊断信息是很容易的:
System.out.println("Failed to connect: " + phoneNumber);
Programmers will generate diagnostic messages in this fashion whether or not you override toString
, but the messages won’t be useful unless you do. The benefits of providing a good toString
method extend beyond instances of the class to objects containing references to these instances, especially collections. Which would you rather see when printing a map, “{Jenny=PhoneNumber@163b91}” or “{Jenny=(707) 867-5309}”?
无论你是否重写toString
方法,程序员们都会以这种方式生成诊断信息,但除非你重写了toString
方法,否则这些信息是无用的。提供一个好的toString
方法的好处是除了类的实例之外,也扩展了包含这些实例引用的对象,尤其是集合。当打印一个映射时,{Jenny=PhoneNumber@163b91}
或{Jenny=(707) 867-5309}
你更喜欢哪一个?
When practical, the toString
method should return all of the interesting information contained in the object, as in the phone number example just shown. It is impractical if the object is large or if it contains state that is not conducive to string representation. Under these circumstances, toString
should return a summary such as “Manhattan white pages (1487536 listings)” or “Thread[main,5,main]”. Ideally, the string should be self-explanatory. (The Thread
example flunks this test.)
当实践时,toString
方法应该返回包含在对象中的所有的感兴趣信息,正如刚才电话号码的例子展示的那样。如果对象很大或它包含不能用字符串表示的状态,重写toString
方法是不切实际的。在这种情况下,toString
应该返回一个概要信息,例如Manhattan white pages (1487536 listings)
或Thread[main,5,main]
。理想情况下,字符串应该是自解释的。(Thread
例子不能满足这样的要求。)
One important decision you’ll have to make when implementing a toString
method is whether to specify the format of the return value in the documentation. It is recommended that you do this for value classes, such as phone numbers or matrices. The advantage of specifying the format is that it serves as a standard, unambiguous, human-readable representation of the object. This representation can be used for input and output and in persistent human-readable data objects, such as XML documents. If you specify the format, it’s usually a good idea to provide a matching static factory or constructor so programmers can easily translate back and forth between the object and its string representation. This approach is taken by many value classes in the Java platform libraries, including BigInteger
, BigDecimal
, and most of the boxed primitive classes.
当实现toString
时,你要做的一个重要决定是是否在文档中指定返回值的格式。对于值类建议你这样做,例如电话号码或矩阵。指定返回值格式的优势在于它能为对象提供一个标准的,清晰的,可读的表示。这个表示可以用在输入输出中,也可以用在一致的可读数据对象中,例如XML文档。如果你指定了格式,提供一个匹配的静态工厂或构造函数通常是一个好主意,程序员可以很容易地在对象和它的字符串表示之间来回转换。Java平台库中许多值类都采用了这个方法,包括BigInteger
,BigDecimal
和大多数基本类型的包装类。
The disadvantage of specifying the format of the toString
return value is that once you’ve specified it, you’re stuck with it for life, assuming your class is widely used. Programmers will write code to parse the representation, to generate it, and to embed it into persistent data. If you change the representation in a future release, you’ll break their code and data, and they will yowl. By failing to specify a format, you preserve the flexibility to add information or improve the format in a subsequent release.
指定toString
返回值格式的劣势在于一旦你指定了它,假设你的类被广泛使用,你就必须一直坚持它。程序员将会写代码转换这种表示,产生这种格式并将它嵌入到持久化数据中。如果你在将来的版本中更改了表示格式,你将会破坏他们的代码和数据,他们将会抱怨。如果你没有指定格式,你保留了添加信息的灵活性或者在后续版本改进这种格式。
Whether or not you decide to specify the format, you should clearly document your intentions. If you specify the format, you should do so precisely. For example, here’s a toString
method to go with the PhoneNumber
class in Item 9:
无论你决定是否指定格式,你都应该清楚地表明你的意图。如果你指定了格式,你应该准确的去做。例如,下面的Item 9中PhoneNumber
类的toString
方法:
/** ... */
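Since the listing is truncated in this copy, here is a sketch of a `toString` with a precisely documented format. The "(XXX) YYY-ZZZZ" layout follows the "(707) 867-5309" example used earlier in this item; the zero-padding, the fixed `Locale.ROOT`, and the accessor names are assumptions of this sketch.

```java
import java.util.Locale;

final class PhoneNumber {
    private final short areaCode;
    private final short prefix;
    private final short lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNumber = (short) lineNumber;
    }

    /**
     * Returns the string representation of this phone number.
     * The string consists of fourteen characters whose format is
     * "(XXX) YYY-ZZZZ", where XXX is the area code, YYY is the prefix,
     * and ZZZZ is the line number. Each part is padded with leading
     * zeros to its full width.
     */
    @Override public String toString() {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "(%03d) %03d-%04d",
                areaCode, prefix, lineNumber);
    }

    // Programmatic access to everything toString reveals.
    short getAreaCode() { return areaCode; }
    short getPrefix() { return prefix; }
    short getLineNumber() { return lineNumber; }
}
```

A map holding such a value prints as `{Jenny=(707) 867-5309}`, and callers who need the individual parts can use the accessors instead of parsing the string.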
If you decide not to specify a format, the documentation comment should read something like this:
如果你没有指定格式,文档注释读起来应该如下:
/** ... */
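The listing is truncated in this copy. As a sketch of the alternative, the documentation comment below describes the output without committing to it. The `Potion` class, its fields, and the sample string are hypothetical.

```java
// Hypothetical class whose toString format is deliberately unspecified.
final class Potion {
    private final int number;
    private final String type;

    Potion(int number, String type) {
        this.number = number;
        this.type = type;
    }

    /**
     * Returns a brief description of this potion. The exact details
     * of the representation are unspecified and subject to change,
     * but the following may be regarded as typical:
     *
     * "[Potion #9: type=love]"
     */
    @Override public String toString() {
        return "[Potion #" + number + ": type=" + type + "]";
    }

    // Accessors, so that nobody has to parse the string.
    int getNumber() { return number; }
    String getType() { return type; }
}
```

Clients that need the number or type should call the accessors; only the wording of the comment, not the string itself, is part of the contract.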
After reading this comment, programmers who produce code or persistent data that depends on the details of the format will have no one but themselves to blame when the format is changed.
写代码或持久化数据的依赖于格式细节的程序员,在读了这个文档之后,一旦格式改变,只能自己负责后果。
Whether or not you specify the format, provide programmatic access to all of the information contained in the value returned by toString
. For example, the PhoneNumber
class should contain accessors for the area code, prefix, and line number. If you fail to do this, you force programmers who need this information to parse the string. Besides reducing performance and making unnecessary work for programmers, this process is error-prone and results in fragile systems that break if you change the format. By failing to provide accessors, you turn the string format into a de facto API, even if you’ve specified that it’s subject to change.
无论你是否指定了格式,都应该提供toString
返回值中包含的所有信息的程序访问接口。例如,PhoneNumber
类应该包含区域码,前缀和线路号的访问器。如果你没有这样做,你会迫使需要这个信息的程序员去解析这个字符串。除了降低性能和给程序员造成不必要的工作之外,这个过程很容易出错,而且会导致系统非常脆弱,一旦你更改了格式系统就会崩溃。如果没有提供访问器,即使你指明了字符串格式是可以变化的,这个字符串格式也变成了事实上的API。
Item 11: Override clone judiciously
The Cloneable
interface was intended as a mixin interface (Item 18) for objects to advertise that they permit cloning. Unfortunately, it fails to serve this purpose. Its primary flaw is that it lacks a clone method, and Object’s clone method is protected. You cannot, without resorting to reflection (Item 53), invoke the clone method on an object merely because it implements Cloneable. Even a reflective invocation may fail, as there is no guarantee that the object has an accessible clone method. Despite this flaw and others, the facility is in wide use so it pays to understand it. This item tells you how to implement a well-behaved clone method, discusses when it is appropriate to do so, and presents alternatives.
So what does Cloneable do, given that it contains no methods? It determines the behavior of Object’s protected clone implementation: if a class implements Cloneable, Object’s clone method returns a field-by-field copy of the object; otherwise it throws CloneNotSupportedException. This is a highly atypical use of interfaces and not one to be emulated. Normally, implementing an interface says something about what a class can do for its clients. In the case of Cloneable, it modifies the behavior of a protected method on a superclass.
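The effect of the marker interface can be seen with two otherwise identical classes. This is a minimal sketch; `CloneableDemo`, `Plain`, `Copyable`, and the helper methods are hypothetical names.

```java
// Sketch: Cloneable changes what Object's protected clone method does.
final class CloneableDemo {
    // Does not implement Cloneable, so Object.clone refuses to copy it.
    static class Plain {
        int x = 1;
        Object copy() throws CloneNotSupportedException { return super.clone(); }
    }

    // Implements Cloneable, so Object.clone returns a field-by-field copy.
    static class Copyable implements Cloneable {
        int x = 1;
        int[] data = {1, 2, 3};
        Object copy() throws CloneNotSupportedException { return super.clone(); }
    }

    // True if cloning a Plain instance throws CloneNotSupportedException.
    static boolean plainCloneFails() {
        try {
            new Plain().copy();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }

    // Clones a Copyable; the checked exception cannot occur for this class.
    static Copyable copyOf(Copyable original) {
        try {
            return (Copyable) original.copy();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

Note that the copy is shallow: the clone's `data` field refers to the same array as the original's, because only the reference is copied.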
If implementing the Cloneable interface is to have any effect on a class, the class and all of its superclasses must obey a fairly complex, unenforceable, and thinly documented protocol. The resulting mechanism is extralinguistic: it creates an object without calling a constructor.
The general contract for the clone method is weak. Here it is, copied from the specification for java.lang.Object [JavaSE6]:
Creates and returns a copy of this object. The precise meaning of “copy” may depend on the class of the object. The general intent is that, for any object x, the expression
x.clone() != x
will be true, and the expression
x.clone().getClass() == x.getClass()
will be true, but these are not absolute requirements. While it is typically the case that
x.clone().equals(x)
will be true, this is not an absolute requirement. Copying an object will typically entail creating a new instance of its class, but it may require copying of internal data structures as well. No constructors are called.
There are a number of problems with this contract. The provision that “no constructors are called” is too strong. A well-behaved clone method can call constructors to create objects internal to the clone under construction. If the class is final, clone can even return an object created by a constructor.
The provision that x.clone().getClass() should generally be identical to x.getClass(), however, is too weak. In practice, programmers assume that if they extend a class and invoke super.clone from the subclass, the returned object will be an instance of the subclass. The only way a superclass can provide this functionality is to return an object obtained by calling super.clone. If a clone method returns an object created by a constructor, it will have the wrong class. Therefore, if you override the clone method in a nonfinal class, you should return an object obtained by invoking super.clone. If all of a class’s superclasses obey this rule, then invoking super.clone will eventually invoke Object’s clone method, creating an instance of the right class. This mechanism is vaguely similar to automatic constructor chaining, except that it isn’t enforced.
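The wrong-class problem can be seen in a small sketch. All four classes below are hypothetical. Base builds its clone with a constructor, which breaks any subclass that relies on super.clone, while WellBehavedBase delegates to super.clone and lets Object.clone create an instance of the right class.

```java
// Hypothetical classes for illustration only.
class Base implements Cloneable {
    // Violates the convention: the copy comes from a constructor.
    @Override public Base clone() {
        return new Base();
    }
}

class Derived extends Base {
    @Override public Derived clone() {
        // Compiles, but super.clone() returns a plain Base,
        // so this cast fails with ClassCastException at run time.
        return (Derived) super.clone();
    }
}

class WellBehavedBase implements Cloneable {
    @Override public WellBehavedBase clone() {
        try {
            return (WellBehavedBase) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // Cannot happen: this class is Cloneable
        }
    }
}

class WellBehavedDerived extends WellBehavedBase {
    @Override public WellBehavedDerived clone() {
        // Object.clone created the object, so its class is WellBehavedDerived.
        return (WellBehavedDerived) super.clone();
    }
}
```

Calling new Derived().clone() throws ClassCastException, whereas new WellBehavedDerived().clone() returns an instance of WellBehavedDerived.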
The Cloneable interface does not, as of release 1.6, spell out in detail the responsibilities that a class takes on when it implements this interface. In practice, a class that implements Cloneable is expected to provide a properly functioning public clone method. It is not, in general, possible to do so unless all of the class’s superclasses provide a well-behaved clone implementation, whether public or protected.
Suppose you want to implement Cloneable in a class whose superclasses provide well-behaved clone methods. The object you get from super.clone() may or may not be close to what you’ll eventually return, depending on the nature of the class. This object will be, from the standpoint of each superclass, a fully functional clone of the original object. The fields declared in your class (if any) will have values identical to those of the object being cloned. If every field contains a primitive value or a reference to an immutable object, the returned object may be exactly what you need, in which case no further processing is necessary. This is the case, for example, for the PhoneNumber class in Item 9. In this case, all you need do in addition to declaring that you implement Cloneable is to provide public access to Object’s protected clone method:
public PhoneNumber clone() { ... }
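The listing is truncated in this copy, so here is a minimal sketch of what such a method might look like. The field types, the constructor, and the equals and hashCode bodies are assumptions added to make the example self-contained. Only the shape of clone follows from the surrounding text: call super.clone(), cast, and handle the checked exception that cannot occur.

```java
// Stand-in for the PhoneNumber class; fields and constructor are assumed.
final class PhoneNumber implements Cloneable {
    private final short areaCode;
    private final short prefix;
    private final short lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNumber = (short) lineNumber;
    }

    // Public access to Object's protected clone, with a covariant return type.
    @Override public PhoneNumber clone() {
        try {
            return (PhoneNumber) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // Cannot happen: PhoneNumber is Cloneable
        }
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof PhoneNumber)) return false;
        PhoneNumber pn = (PhoneNumber) o;
        return pn.areaCode == areaCode
            && pn.prefix == prefix
            && pn.lineNumber == lineNumber;
    }

    @Override public int hashCode() {
        return 31 * (31 * areaCode + prefix) + lineNumber;
    }
}
```

With this sketch, a clone is a distinct object that is equal to the original and has the same class, which matches the general intent of the contract quoted earlier.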
Note that the above clone method returns PhoneNumber, not Object. As of release 1.5, it is legal and desirable to do this, because covariant return types were introduced in release 1.5 as part of generics. In other words, it is now legal for an overriding method’s return type to be a subclass of the overridden method’s return type. This allows the overriding method to provide more information about the returned object and eliminates the need for casting in the client. Because Object.clone returns Object, PhoneNumber.clone must cast the result of super.clone() before returning it, but this is far preferable to requiring every caller of PhoneNumber.clone to cast the result. The general principle at play here is never make the client do anything the library can do for the client.
If an object contains fields that refer to mutable objects, using the simple clone implementation shown above can be disastrous. For example, consider the Stack class in Item 6:
public class Stack { ... }
Suppose you want to make this class cloneable. If its clone method merely returns super.clone(), the resulting Stack instance will have the correct value in its size field, but its elements field will refer to the same array as the original Stack instance. Modifying the original will destroy the invariants in the clone and vice versa. You will quickly find that your program produces nonsensical results or throws a NullPointerException.
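The failure can be reproduced with a small sketch. BrokenStack is a hypothetical stand-in for the Stack class, reduced to push and pop, whose clone method merely returns super.clone(). The initial capacity and growth policy are assumptions.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Hypothetical stand-in for Stack with a shallow, broken clone.
class BrokenStack implements Cloneable {
    private Object[] elements = new Object[16];
    private int size = 0;

    void push(Object e) {
        if (elements.length == size)
            elements = Arrays.copyOf(elements, 2 * size + 1);
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) throw new EmptyStackException();
        Object result = elements[--size];
        elements[size] = null; // Eliminate obsolete reference
        return result;
    }

    // Broken: size is copied, but elements still refers to the original's array.
    @Override public BrokenStack clone() {
        try {
            return (BrokenStack) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class BrokenStackDemo {
    // Returns what the clone pops after the original has popped once.
    static Object popFromCloneAfterOriginalPop() {
        BrokenStack original = new BrokenStack();
        original.push("a");
        original.push("b");
        BrokenStack copy = original.clone();
        original.pop();    // nulls out slot 1 in the shared array
        return copy.pop(); // the clone's size is still 2, but slot 1 is now null
    }
}
```

Here the clone's pop returns null rather than "b", because the original's pop cleared the slot in the array that both objects share.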
This situation could never occur as a result of calling the sole constructor in the Stack class. In effect, the clone method functions as another constructor; you must ensure that it does no harm to the original object and that it properly establishes invariants on the clone. In order for the clone method on Stack to work properly, it must copy the internals of the stack. The easiest way to do this is to call clone recursively on the elements array:
public Stack clone() { ... }
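Since the listing is truncated here, the following is a sketch of a Stack whose clone method copies the elements array as the text describes. The push and pop bodies, the initial capacity, and the growth policy are assumptions made to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Sketch of a cloneable Stack; details other than clone are assumed.
class Stack implements Cloneable {
    private Object[] elements = new Object[16]; // not final, so clone can reassign it
    private int size = 0;

    void push(Object e) {
        if (elements.length == size)
            elements = Arrays.copyOf(elements, 2 * size + 1);
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) throw new EmptyStackException();
        Object result = elements[--size];
        elements[size] = null; // Eliminate obsolete reference
        return result;
    }

    @Override public Stack clone() {
        try {
            Stack result = (Stack) super.clone();
            // Give the clone its own array; no cast is needed on an array clone.
            result.elements = elements.clone();
            return result;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

After cloning, popping from the original no longer affects the clone. The array copy is itself shallow, so the two stacks still share the element objects, which is usually what a collection copy is expected to do.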
Note that we do not have to cast the result of elements.clone() to Object[]. As of release 1.5, calling clone on an array returns an array whose compile-time type is the same as that of the array being cloned.
Note also that the above solution would not work if the elements field were final, because clone would be prohibited from assigning a new value to the field. This is a fundamental problem: the clone architecture is incompatible with normal use of final fields referring to mutable objects, except in cases where the mutable objects may be safely shared between an object and its clone. In order to make a class cloneable, it may be necessary to remove final modifiers from some fields.
It is not always sufficient to call clone recursively. For example, suppose you are writing a clone method for a hash table whose internals consist of an array of buckets, each of which references the first entry in a linked list of key-value pairs or is null if the bucket is empty. For performance, the class implements its own lightweight singly linked list instead of using java.util.LinkedList internally:
public class HashTable implements Cloneable { ... }
Suppose you merely clone the bucket array recursively, as we did for Stack:
// Broken - results in shared internal state!
Though the clone has its own bucket array, this array references the same linked lists as the original, which can easily cause nondeterministic behavior in both the clone and the original. To fix this problem, you’ll have to copy the linked list that comprises each bucket individually. Here is one common approach:
public class HashTable implements Cloneable { ... }
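The listing is truncated in this copy. A sketch of the approach, under stated assumptions, might look like the following. The put, get, and indexFor methods and the fixed capacity of 16 are hypothetical additions so the example can be exercised. The deepCopy method on Entry and the loop in clone are the part this paragraph is about.

```java
// Sketch of a hash table whose clone deep-copies each bucket's linked list.
class HashTable implements Cloneable {
    private Entry[] buckets = new Entry[16]; // capacity chosen arbitrarily

    private static class Entry {
        final Object key;
        Object value;
        Entry next;

        Entry(Object key, Object value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }

        // Recursively copy the linked list headed by this Entry
        Entry deepCopy() {
            return new Entry(key, value, next == null ? null : next.deepCopy());
        }
    }

    // Hypothetical helpers; the original does not show them.
    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    void put(Object key, Object value) {
        int i = indexFor(key);
        for (Entry e = buckets[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value; // overwrite in place
                return;
            }
        }
        buckets[i] = new Entry(key, value, buckets[i]);
    }

    Object get(Object key) {
        for (Entry e = buckets[indexFor(key)]; e != null; e = e.next)
            if (e.key.equals(key)) return e.value;
        return null;
    }

    @Override public HashTable clone() {
        try {
            HashTable result = (HashTable) super.clone();
            result.buckets = new Entry[buckets.length];
            for (int i = 0; i < buckets.length; i++)
                if (buckets[i] != null)
                    result.buckets[i] = buckets[i].deepCopy();
            return result;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

Because every Entry is copied, overwriting a value in the original after cloning leaves the clone's value unchanged.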
The private class HashTable.Entry has been augmented to support a “deep copy” method. The clone method on HashTable allocates a new buckets array of the proper size and iterates over the original buckets array, deep-copying each nonempty bucket. The deep-copy method on Entry invokes itself recursively to copy the entire linked list headed by the entry. While this technique is cute and works fine if the buckets aren’t too long, it is not a good way to clone a linked list because it consumes one stack frame for each element in the list. If the list is long, this could easily cause a stack overflow. To prevent this from happening, you can replace the recursion in deepCopy with iteration:
// Iteratively copy the linked list headed by this Entry
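Only the leading comment of this listing survives here, so the following is a sketch of one way to write the iterative version. Entry is shown as a standalone class for brevity, and the length method is a hypothetical helper used only to check the result.

```java
// Standalone sketch of the Entry class with an iterative deepCopy.
class Entry {
    final Object key;
    Object value;
    Entry next;

    Entry(Object key, Object value, Entry next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }

    // Iteratively copy the linked list headed by this Entry
    Entry deepCopy() {
        Entry result = new Entry(key, value, next);
        // Walk the new list, replacing each borrowed successor with a fresh copy.
        for (Entry p = result; p.next != null; p = p.next)
            p.next = new Entry(p.next.key, p.next.value, p.next.next);
        return result;
    }

    // Hypothetical helper: number of entries in the list headed by this Entry.
    int length() {
        int n = 0;
        for (Entry e = this; e != null; e = e.next) n++;
        return n;
    }
}
```

The loop uses a constant amount of stack regardless of list length, so a very long chain can be copied without the risk of overflow that the recursive version carries.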
A final approach to cloning complex objects is to call super.clone, set all of the fields in the resulting object to their virgin state, and then call higher-level methods to regenerate the state of the object. In the case of our HashTable example, the buckets field would be initialized to a new bucket array, and the put(key, value) method (not shown) would be invoked for each key-value mapping in the hash table being cloned. This approach typically yields a simple, reasonably elegant clone method that generally doesn’t run quite as fast as one that directly manipulates the innards of the object and its clone.
Like a constructor, a clone method should not invoke any nonfinal methods on the clone under construction (Item 17). If clone invokes an overridden method, this method will execute before the subclass in which it is defined has had a chance to fix its state in the clone, quite possibly leading to corruption in the clone and the original. Therefore the put(key, value) method discussed in the previous paragraph should be either final or private. (If it is private, it is presumably the “helper method” for a nonfinal public method.)
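The two preceding paragraphs can be combined in one sketch. RebuildingHashTable and putInternal are hypothetical names. The clone method resets buckets to a fresh array and repopulates it through a private helper, so no overridable method runs on the clone under construction.

```java
// Sketch: clone by resetting state and replaying higher-level operations.
class RebuildingHashTable implements Cloneable {
    private Entry[] buckets = new Entry[16]; // capacity chosen arbitrarily

    private static class Entry {
        final Object key;
        Object value;
        final Entry next;

        Entry(Object key, Object value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Nonfinal public method; delegates to the private helper.
    public void put(Object key, Object value) {
        putInternal(key, value);
    }

    // Private, so it is safe for clone to call on the clone under construction.
    private void putInternal(Object key, Object value) {
        int i = (key.hashCode() & 0x7fffffff) % buckets.length;
        for (Entry e = buckets[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        buckets[i] = new Entry(key, value, buckets[i]);
    }

    public Object get(Object key) {
        int i = (key.hashCode() & 0x7fffffff) % buckets.length;
        for (Entry e = buckets[i]; e != null; e = e.next)
            if (e.key.equals(key)) return e.value;
        return null;
    }

    @Override public RebuildingHashTable clone() {
        try {
            RebuildingHashTable result = (RebuildingHashTable) super.clone();
            result.buckets = new Entry[buckets.length]; // reset to an empty table
            for (Entry head : buckets)
                for (Entry e = head; e != null; e = e.next)
                    result.putInternal(e.key, e.value); // regenerate the state
            return result;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

The trade-off is the one the text names: the clone method stays short and never touches Entry links directly, at the cost of rehashing every key.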
Object’s clone method is declared to throw CloneNotSupportedException, but overriding clone methods can omit this declaration. Public clone methods should omit it because methods that don’t throw checked exceptions are easier to use (Item 59). If a class that is designed for inheritance (Item 17) overrides clone, the overriding method should mimic the behavior of Object.clone: it should be declared protected, it should be declared to throw CloneNotSupportedException, and the class should not implement Cloneable. This gives subclasses the freedom to implement Cloneable or not, just as if they extended Object directly.
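As a sketch of this arrangement, consider the hypothetical hierarchy below. Shape is designed for inheritance and mimics Object.clone. CloneableShape opts in to cloning with a public method that omits the checked exception, and PlainShape does not opt in.

```java
// Hypothetical class designed for inheritance; it does not implement Cloneable.
class Shape {
    private int sides = 4;

    int sides() {
        return sides;
    }

    // Mimics Object.clone: protected, and declared to throw the checked exception.
    @Override protected Shape clone() throws CloneNotSupportedException {
        return (Shape) super.clone();
    }
}

// A subclass that chooses to support cloning.
class CloneableShape extends Shape implements Cloneable {
    // Public, covariant return type, no checked exception.
    @Override public CloneableShape clone() {
        try {
            return (CloneableShape) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // Cannot happen: this class is Cloneable
        }
    }
}

// A subclass that chooses not to support cloning.
class PlainShape extends Shape {
    // True if an attempt to clone is rejected, as it would be for a direct Object subclass.
    boolean cloneRejected() {
        try {
            clone();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }
}
```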
One more detail bears noting. If you decide to make a thread-safe class implement Cloneable, remember that its clone method must be properly synchronized just like any other method (Item 66). Object’s clone method is not synchronized, so even if it is otherwise satisfactory, you may have to write a synchronized clone method that invokes super.clone().
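A minimal sketch, assuming a hypothetical SafeCounter class whose state is guarded by its intrinsic lock. The clone method does nothing beyond super.clone(), but it holds the same lock as the other methods so the copy is taken from a consistent state.

```java
// Hypothetical thread-safe class; count is guarded by the intrinsic lock.
class SafeCounter implements Cloneable {
    private long count;

    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // Synchronized so the field-by-field copy is made under the guarding lock.
    @Override public synchronized SafeCounter clone() {
        try {
            return (SafeCounter) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // Cannot happen: SafeCounter is Cloneable
        }
    }
}
```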
To recap, all classes that implement Cloneable should override clone with a public method whose return type is the class itself. This method should first call super.clone and then fix any fields that need to be fixed. Typically, this means copying any mutable objects that comprise the internal “deep structure” of the object being cloned, and replacing the clone’s references to these objects with references to the copies. While these internal copies can generally be made by calling clone recursively, this is not always the best approach. If the class contains only primitive fields or references to immutable objects, then it is probably the case that no fields need to be fixed. There are exceptions to this rule. For example, a field representing a serial number or other unique ID or a field representing the object’s creation time will need to be fixed, even if it is primitive or immutable.
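The unique-ID exception can be sketched as follows. Ticket and its fields are hypothetical. The id field is primitive, yet clone must still assign a fresh value, and for that reason the field cannot be declared final.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class with a primitive field that must differ in every instance.
class Ticket implements Cloneable {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private long id = NEXT_ID.getAndIncrement(); // not final: clone reassigns it
    private final String label;                  // immutable, safe to share

    Ticket(String label) {
        this.label = label;
    }

    long id() {
        return id;
    }

    String label() {
        return label;
    }

    @Override public Ticket clone() {
        try {
            Ticket result = (Ticket) super.clone();
            result.id = NEXT_ID.getAndIncrement(); // fix the field copied by Object.clone
            return result;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // Cannot happen: Ticket is Cloneable
        }
    }
}
```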
Is all this complexity really necessary? Rarely. If you extend a class that implements Cloneable, you have little choice but to implement a well-behaved clone method. Otherwise, you are better off providing an alternative means of object copying, or simply not providing the capability. For example, it doesn’t make sense for immutable classes to support object copying, because copies would be virtually indistinguishable from the original.
A fine approach to object copying is to provide a copy constructor or copy factory. A copy constructor is simply a constructor that takes a single argument whose type is the class containing the constructor, for example,
public Yum(Yum yum);
A copy factory is the static factory analog of a copy constructor:
public static Yum newInstance(Yum yum);
The copy constructor approach and its static factory variant have many advantages over Cloneable/clone: they don’t rely on a risk-prone extralinguistic object creation mechanism; they don’t demand unenforceable adherence to thinly documented conventions; they don’t conflict with the proper use of final fields; they don’t throw unnecessary checked exceptions; and they don’t require casts. While it is impossible to put a copy constructor or factory in an interface, Cloneable fails to function as an interface because it lacks a public clone method. Therefore you aren’t giving up interface functionality by using a copy constructor or factory in preference to a clone method.
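A sketch of both forms for the Yum class follows. The original gives only the two signatures, so the fields, the primary constructor, and the accessor methods here are assumptions. Note that toppings is a final field referring to a mutable object, which the clone mechanism could not accommodate.

```java
import java.util.ArrayList;
import java.util.List;

// Yum's contents are assumed; only the two signatures come from the text.
final class Yum {
    private final String flavor;
    private final List<String> toppings; // final field, mutable object

    Yum(String flavor, List<String> toppings) {
        this.flavor = flavor;
        this.toppings = new ArrayList<String>(toppings); // defensive copy
    }

    // Copy constructor
    public Yum(Yum yum) {
        this(yum.flavor, yum.toppings);
    }

    // Copy factory
    public static Yum newInstance(Yum yum) {
        return new Yum(yum);
    }

    void addTopping(String topping) {
        toppings.add(topping);
    }

    String flavor() {
        return flavor;
    }

    List<String> toppings() {
        return new ArrayList<String>(toppings);
    }
}
```

Neither form needs a cast, a checked exception, or a nonfinal field, and a copy made before a later addTopping call on the original is unaffected by it.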
Furthermore, a copy constructor or factory can take an argument whose type is an interface implemented by the class. For example, by convention all general-purpose collection implementations provide a constructor whose argument is of type Collection or Map. Interface-based copy constructors and factories, more properly known as conversion constructors and conversion factories, allow the client to choose the implementation type of the copy rather than forcing the client to accept the implementation type of the original. Suppose you have a HashSet s, and you want to copy it as a TreeSet. The clone method can’t offer this functionality, but it’s easy with a conversion constructor: new TreeSet(s).
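The HashSet to TreeSet conversion can be written out as a short sketch. ConversionDemo and sortedCopy are hypothetical names, and the element type String is chosen only for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ConversionDemo {
    // Copies any Set into a TreeSet using TreeSet's conversion constructor.
    static TreeSet<String> sortedCopy(Set<String> s) {
        return new TreeSet<String>(s);
    }

    public static void main(String[] args) {
        Set<String> s = new HashSet<String>();
        s.add("pear");
        s.add("apple");
        s.add("fig");
        TreeSet<String> sorted = sortedCopy(s);
        System.out.println(sorted); // prints [apple, fig, pear]
    }
}
```

The copy has a different implementation type from the original and is independent of it: later changes to s do not appear in the TreeSet.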
Given all of the problems associated with Cloneable, it’s safe to say that other interfaces should not extend it, and that classes designed for inheritance (Item 17) should not implement it. Because of its many shortcomings, some expert programmers simply choose never to override the clone method and never to invoke it except, perhaps, to copy arrays. If you design a class for inheritance, be aware that if you choose not to provide a well-behaved protected clone method, it will be impossible for subclasses to implement Cloneable.